Improvement of performance for a maze solving program in Python

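Fellow programmers, I need help with one of my projects. I am writing a maze solver. It reads an image file that must be black and white (black pixels are walls, white pixels are paths), with exactly one white pixel on the top row as the entrance of the maze and exactly one white pixel on the bottom row as the exit.

The code has three main parts:

1) The program first creates nodes in the maze, following a set of rules. For example, here is a simple maze:

[Image: a simple example maze]

And here it is with all the nodes drawn in red:

[Image: the same maze with the nodes drawn in red]

The nodes are the corners, the crossroads, and every other point where you can change direction. The distance between each node and the exit of the maze is also measured. As the program generates the nodes, it places them in a list.

2) Once all the nodes are generated, the program iterates through every node in the list and searches in each possible direction for other nodes, to "link" them and establish the possible paths. For example, if it detects a path above a node, it checks each pixel in that column, moving upward from the node's coordinates, and for each pixel it iterates through the whole node list again to see whether another node matches those coordinates. If it finds one at some point, it links the two and starts searching to the right (if there is a path to the right, of course), and so on.

3) Once all the nodes are linked together and every possible path is established, it starts from the entry node of the maze and runs my implementation of the A* algorithm to find the right path, backtracking when it hits a dead end. As you can see, it solves mazes without difficulty.

[Image: the solved maze]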
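The node rule from step 1 can be sketched on a plain grid instead of an image. This is only an illustration under stated assumptions, not the code from the Pastebin: the grid encoding (0 for wall, 1 for path), the name `find_nodes`, and the exact rule (every open cell is a node unless it is a straight corridor cell) are all hypothetical choices made for this sketch.

```python
def find_nodes(grid):
    """Return the (x, y) positions of node cells in a grid of 0 (wall) / 1 (path).

    Assumed rule: an open cell is a node unless it is a plain straight
    corridor cell (open only up+down, or only left+right).
    """
    height, width = len(grid), len(grid[0])

    def is_open(x, y):
        return 0 <= x < width and 0 <= y < height and grid[y][x] == 1

    nodes = []
    for y in range(height):
        for x in range(width):
            if not is_open(x, y):
                continue
            up, right = is_open(x, y - 1), is_open(x + 1, y)
            down, left = is_open(x, y + 1), is_open(x - 1, y)
            straight_vertical = up and down and not left and not right
            straight_horizontal = left and right and not up and not down
            if not (straight_vertical or straight_horizontal):
                nodes.append((x, y))
    return nodes


# A tiny maze: entrance at the top, exit at the bottom, two corners.
MAZE = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
print(find_nodes(MAZE))
```

With this rule the entrance, the exit, dead ends, corners, and junctions all become nodes, while the pixels in the middle of a corridor do not.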

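So my program works. What's the problem, then? The problem is the node-linking part. On small mazes it takes about half a second. But make the maze somewhat bigger and the number of nodes increases drastically. And since it iterates through the node list a LOT (once for every pixel it searches, for every node), you can imagine that if I have 600,000 nodes... it takes ages.

So that's what I'm asking for help with: a better, faster way to link the nodes together. I have posted the entire code on Pastebin (https://pastebin.com/xtmtm7wb; sorry if it's a bit messy, it was very late when I programmed it). The node-linking part starts at line 133 and ends at line 196.

Here is the node-linking code: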


counter = 0
last = 0
for node in nodelist:
    a = node.pos[0]
    b = node.pos[1]
    if node.paths[0]:
        for i in range(b-1,0,-1):
            if mazepix[a,i] == blackpix:
                break
            if any(x.pos == (a,i) for x in nodelist):
                for iteration in nodelist:
                    if iteration.pos == (a,i):
                        indexofnodetolinkto = iteration.index
                        break
                node.connections.append(indexofnodetolinkto)
                # print("Linking node %d and node %d..."%(node.index, indexofnodetolinkto))
                break

    if node.paths[1]:
        for i in range(a+1,maze.size[0]):
            if mazepix[i,b] == blackpix:
                break
            if any(x.pos == (i,b) for x in nodelist):
                for iteration in nodelist:
                    if iteration.pos == (i,b):
                        indexofnodetolinkto = iteration.index
                        break
                node.connections.append(indexofnodetolinkto)
                # print("Linking node %d and node %d..."%(node.index, indexofnodetolinkto))
                break

    if node.paths[2]:
        for i in range(b+1,maze.size[1]):
            if mazepix[a,i] == blackpix:
                break
            if any(x.pos == (a,i) for x in nodelist):
                for iteration in nodelist:
                    if iteration.pos == (a,i):
                        indexofnodetolinkto = iteration.index
                        break
                node.connections.append(indexofnodetolinkto)
                # print("Linking node %d and node %d..."%(node.index, indexofnodetolinkto))
                break

    if node.paths[3]:
        for i in range(a-1,0,-1):
            if mazepix[i,b] == blackpix:
                break
            if any(x.pos == (i,b) for x in nodelist):
                for iteration in nodelist:
                    if iteration.pos == (i,b):
                        indexofnodetolinkto = iteration.index
                        break
                node.connections.append(indexofnodetolinkto)
                # print("Linking node %d and node %d..."%(node.index, indexofnodetolinkto))
                break

    counter += 1
    percent = (counter/nbrofnodes)*100
    if int(percent)%10 == 0 and int(percent) != last:
        print("Linking %d%% done..."%percent)
        last = int(percent)

print("All node linked.")

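Thank you if you read all of this. I know it's not a very precise question, but I've spent a lot of time trying to make this work, and now I'm really stuck on how to improve it.

Related discussion

Your program is very slow because this part takes a long time and you run it many times for every node: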


for iteration in nodelist:
    if iteration.pos == (i,b):
        indexofnodetolinkto = iteration.index
        break

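There are a lot of ways to make it a lot faster.

You could put the nodes into a dictionary using the position as the key, so you can just look up a position to find the node there.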
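A minimal sketch of the dictionary idea, under some assumptions: the `Node` class below only imitates the fields the question's code uses (`index`, `pos`, `paths`, `connections`), the maze is a plain grid (0 for wall, 1 for path) rather than a PIL image, and `link_nodes` is a hypothetical name. The point is just that the inner scan over `nodelist` becomes a single `dict` lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    index: int
    pos: tuple      # (x, y)
    paths: tuple    # (up, right, down, left) flags, same order as in the question
    connections: list = field(default_factory=list)


# Step offsets in the same order as Node.paths.
DIRECTIONS = ((0, -1), (1, 0), (0, 1), (-1, 0))


def link_nodes(nodelist, grid):
    """Link each node to the first node found along each open direction."""
    height, width = len(grid), len(grid[0])
    node_at = {node.pos: node for node in nodelist}  # built once
    for node in nodelist:
        for has_path, (dx, dy) in zip(node.paths, DIRECTIONS):
            if not has_path:
                continue
            x, y = node.pos[0] + dx, node.pos[1] + dy
            # Walk along the corridor until a wall or the image border.
            while 0 <= x < width and 0 <= y < height and grid[y][x] == 1:
                other = node_at.get((x, y))  # average O(1) instead of a list scan
                if other is not None:
                    node.connections.append(other.index)
                    break
                x, y = x + dx, y + dy


grid = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
nodes = [
    Node(0, (1, 0), (False, False, True, False)),
    Node(1, (1, 2), (True, True, False, False)),
    Node(2, (3, 2), (False, False, True, True)),
    Node(3, (3, 4), (True, False, False, False)),
]
link_nodes(nodes, grid)
print([n.connections for n in nodes])
```

The pixel walk is still there, but each pixel now costs one hash lookup rather than a pass over every node.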

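Even better, you could put the nodes into row lists and column lists, sorted by position, and then only try to connect adjacent nodes in the lists.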
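The row/column idea could look roughly like the following. It is a sketch that assumes every place where a corridor ends or turns is a node, so that a node with an open pixel to its right is always connected to the next node in its row (and likewise downward in its column). The function name and the `opens_right` / `opens_down` flag lists are hypothetical stand-ins for `node.paths[1]` and `node.paths[2]`.

```python
from collections import defaultdict


def link_by_rows_and_columns(positions, opens_right, opens_down):
    """Return an adjacency list linking neighbouring nodes.

    positions[i] is the (x, y) of node i; opens_right[i] / opens_down[i]
    say whether the pixel immediately right of / below node i is open.
    """
    rows = defaultdict(list)     # y -> indices of the nodes in that row
    columns = defaultdict(list)  # x -> indices of the nodes in that column
    for i, (x, y) in enumerate(positions):
        rows[y].append(i)
        columns[x].append(i)

    links = [[] for _ in positions]

    def connect(group, sort_key, opens):
        group.sort(key=sort_key)
        for a, b in zip(group, group[1:]):
            if opens[a]:  # the corridor continues from a toward b
                links[a].append(b)
                links[b].append(a)

    for row in rows.values():
        connect(row, lambda i: positions[i][0], opens_right)
    for column in columns.values():
        connect(column, lambda i: positions[i][1], opens_down)
    return links


positions = [(1, 0), (1, 2), (3, 2), (3, 4)]
links = link_by_rows_and_columns(
    positions,
    opens_right=[False, True, False, False],
    opens_down=[True, False, True, False],
)
print(links)
```

No pixels are walked at all here; the cost is dominated by sorting the rows and columns, which is O(n log n) in the number of nodes.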

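But the best thing to do is to forget about these nodes altogether and just do a BFS search directly on the bitmap.

Since this is a fun problem, I wrote a fast version with a simple BFS. I don't want to ruin all your fun, so here's just the BFS part, so you can see what I mean by doing BFS directly on the image: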


    #Breadth-first search over the graph
    #We use special colored pixels in the image to mark visited locations and their distance
    nextlevel=[(entrypos,0)]
    nextdist=0
    mazepix[entrypos,0] = markerPixel(nextdist)
    exitpos = -1
    while exitpos<0:
        if len(nextlevel) < 1:
            print("Could not find exit!")
            return
        prevlevel = nextlevel
        nextdist += 1
        nextlevel = []
        nextpix = markerPixel(nextdist)

        for prevpos in prevlevel:
            for dir in [(-1,0),(0,1),(1,0),(0,-1)]:
                x = prevpos[0]+dir[0]
                y = prevpos[1]+dir[1]
                if x>=0 and y>=0 and x<W and y<H and mazepix[x,y]==whitepix:
                    nextlevel.append((x,y))
                    #mark it used and distance mod 3
                    mazepix[x,y]=nextpix
                    if y>=H-1:
                        exitpos=x

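Instead of using a separate set of objects and links to remember the path, we mark pixels as visited directly in the image. Instead of using actual links of any kind to link one pixel to another, we just check all four directions for adjacent white pixels whenever we need to.

This is a level-by-level BFS, so we always know how far new pixels are from the entrance, and the color we mark a visited pixel with indicates its distance from the entrance (mod 3). This allows us to reconstruct the shortest path when we find the exit.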
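To see why the distance mod 3 is enough, here is a small self-contained sketch of the same idea on a plain grid rather than an image. The encoding (0 for wall, 1 for open, and 2, 3, 4 standing in for the three marker colors) and the names `marker` and `shortest_path` are assumptions made for this illustration. Every visited neighbor of a cell at distance d is at distance d-1, d, or d+1, and those three values differ mod 3, so stepping to any neighbor carrying the marker for d-1 always moves one step closer to the entrance.

```python
STEPS = ((-1, 0), (0, 1), (1, 0), (0, -1))


def marker(distance):
    # 2, 3, 4 play the role of the three marker colors (0 = wall, 1 = open).
    return 2 + distance % 3


def shortest_path(grid):
    """Return the shortest top-to-bottom path as a list of (x, y), or None."""
    cells = [row[:] for row in grid]  # work on a copy, like drawing on the image
    height, width = len(cells), len(cells[0])
    if 1 not in cells[0]:
        return None
    start = (cells[0].index(1), 0)
    cells[0][start[0]] = marker(0)
    level, distance = [start], 0
    goal = start if height == 1 else None

    # Level-by-level BFS: every cell in `level` is `distance` steps from start.
    while goal is None and level:
        distance += 1
        next_level = []
        for px, py in level:
            for dx, dy in STEPS:
                x, y = px + dx, py + dy
                if 0 <= x < width and 0 <= y < height and cells[y][x] == 1:
                    cells[y][x] = marker(distance)
                    next_level.append((x, y))
                    if y == height - 1:
                        goal = (x, y)
        level = next_level
    if goal is None:
        return None

    # Walk back: from a cell at distance d, step to a neighbor marked d-1 (mod 3).
    path = [goal]
    while distance > 0:
        distance -= 1
        px, py = path[-1]
        for dx, dy in STEPS:
            x, y = px + dx, py + dy
            if 0 <= x < width and 0 <= y < height and cells[y][x] == marker(distance):
                path.append((x, y))
                break
    path.reverse()
    return path


MAZE = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
print(shortest_path(MAZE))
```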

EDIT: It's been a long time and the OP has had their fun, so here is the complete Python solver:

from PIL import Image
import math
import sys
import time
import pickle
import os

whitepix = (255,255,255)
blackpix = (0,0,0)
redpix = (255,0,0)
greenpix = (0,255,0)

def markerPixel(distance):
    val=120+(distance%3)*40
    return (val,val,0)

def smallerMarker(pixel):
    return markerPixel(pixel[0]-1)

def isMarker(pixel):
    return pixel[2]==0 and pixel[0]==pixel[1] and pixel[0]>=120

def solve(imagename, outputname, showmarkers):

    maze = Image.open(imagename)
    maze = maze.convert('RGB')
    mazepix = maze.load()
    nodelist = []

    print(maze.size)

    starttime = time.time()

    W = maze.size[0]
    H = maze.size[1]
    entrypos = -1

    # Find the entry
    for i in range(0,W):
        if mazepix[i, 0] == whitepix:
            entrypos=i
            break

    if entrypos < 0:
        print("No entry!")
        return

    #Breadth-first search over the graph
    #We use special colored pixels in the image to mark visited locations and their distance
    nextlevel=[(entrypos,0)]
    nextdist=0
    mazepix[entrypos,0] = markerPixel(nextdist)
    exitpos = -1
    while exitpos<0:
        if len(nextlevel) < 1:
            print("Could not find exit!")
            return
        prevlevel = nextlevel
        nextdist += 1
        nextlevel = []
        nextpix = markerPixel(nextdist)

        for prevpos in prevlevel:
            for dir in [(-1,0),(0,1),(1,0),(0,-1)]:
                x = prevpos[0]+dir[0]
                y = prevpos[1]+dir[1]
                if x>=0 and y>=0 and x<W and y<H and mazepix[x,y]==whitepix:
                    nextlevel.append((x,y))
                    #mark it used and distance mod 3
                    mazepix[x,y]=nextpix
                    if y>=H-1:
                        exitpos=x

    #found the exit -- color the path green
    nextpos = (exitpos,H-1)
    while nextpos != None:
        nextpix = smallerMarker(mazepix[nextpos[0],nextpos[1]])
        prevpos = nextpos
        mazepix[nextpos[0],nextpos[1]] = greenpix
        nextpos = None
        #find the next closest position -- all adjacent positions are either
        #1 closer, 1 farther, or the same distance, so distance%3 is enough
        #to distinguish them
        for dir in [(-1,0),(0,1),(1,0),(0,-1)]:
            x = prevpos[0]+dir[0]
            y = prevpos[1]+dir[1]
            if x>=0 and y>=0 and x<W and y<H and mazepix[x,y]==nextpix:
                nextpos=(x,y)
                break

    #Erase the marker pixels if desired
    if not showmarkers:
        for y in range(0,H):
            for x in range(0,W):
                if isMarker(mazepix[x,y]):
                    mazepix[x,y]=whitepix

    maze.save(outputname)

solve("maze.gif","solved.png", False)

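Your maze is just 301x301 pixels, so in my opinion 0.5 seconds is too long for the solution. When I used the raster A* approach from:

How to speed up A* algorithm at large spatial scales?

the whole solution took just ~1.873ms, which is hugely different from your ~500ms. The coarse graph A* has a bigger overhead, so I was curious and wanted to test it. I coded my own version (in C++, based on the same code as in the link above), and the result was that acquiring the graph from the image took [timing lost] and the graph A* took [timing lost], so that is still a huge difference from yours (give or take the difference in computer setup).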

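So what to check first?

I am not a Python coder, but I do not see any image access in your code. You should check whether your image access is fast. This is a common mistake for rookies, who use things like:

GetPixel/PutPixel
Pixels[][]

Those are usually painfully slow (in my experience with GDI on Win32, 1000-10000 times slower than direct pixel access) and correcting them makes a huge difference. For more info see:

Display an array of color in C

Another common mistake with lists is to add elements incrementally without preallocating. For small lists this is not a problem, but with a large number of elements the reallocation on each addition slows things down by copying the contents over and over again. The same goes for inserting and deleting elements in a list. Improving list access makes a huge difference, especially for polynomial complexities like O(n^2) and slower.

Algorithm

Small changes in the algorithm can have a huge impact. In your case I used a combination of DIP edge-detection techniques and accelerating structures for topologically sorted edges. That changes the O(n) or O(n^2) searches into simple O(1) operations. The idea is to keep ordered lists of all the maze vertices, sorted by xy and by yx. If each vertex knows its index in such a structure, it can easily obtain its neighboring vertices...

Stack/heap trashing

This slows things down a lot, especially with recursive functions. The bigger the recursion level and the size of the operands passed, the worse the effect.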
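To make the earlier point about list handling concrete for Python: in CPython, `list.append` is amortized constant time, so the cost described above shows up mainly when inserting or removing at the front of a list (`insert(0, ...)` or `pop(0)`), which shifts every remaining element. The sketch below is only an illustration; the function name `bfs_order` and the adjacency-list input are hypothetical. It uses `collections.deque` as the FIFO queue of a breadth-first traversal.

```python
from collections import deque


def bfs_order(adjacency, start):
    """Return the nodes reachable from `start` in breadth-first order.

    adjacency[i] is the list of node indices linked to node i.
    """
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()  # O(1); list.pop(0) would shift every element
        order.append(node)      # appending at the end is amortized O(1)
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


adjacency = [[1], [0, 2], [1, 3], [2]]
print(bfs_order(adjacency, 0))
```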

Here is a simple C++ example based on the link above:


//---------------------------------------------------------------------------
//--- A star class ver: 1.00 ------------------------------------------------
//---------------------------------------------------------------------------
#ifndef _A_star_h
#define _A_star_h
//---------------------------------------------------------------------------
#include "list.h"
//---------------------------------------------------------------------------
class A_star_graph
{
public:
    // variables
    struct _pnt
    {
        int x,y;            // 2D position (image)
        int mx;             // mxy[y][mx] index
        int my;             // mxy[x][my] index
        int pN,pS,pE,pW;    // index of linked point in direction or -1
        int lN,lS,lE,lW;    // distance to linked point in direction or 0 (cost for A*)
        int a;              // value for A*
        _pnt() {}
        _pnt(_pnt& a) { *this=a; }
        ~_pnt() {}
        _pnt* operator = (const _pnt *a) { *this=*a; return this; }
        //_pnt* operator = (const _pnt &a) { ...copy... return this; }
    };
    List<_pnt> pnt;             // list of all vertexes
    List< List<int> > mxy,myx;  // xy and yx index sorted pnt[] (indexes of points)
    List<int> path;             // found path (indexes of points)

    int xs,ys;                  // map resolution
    DWORD col_space;            // colors for rendering
    DWORD col_wall ;
    DWORD col_path ;

    // internals
    A_star_graph();
    A_star_graph(A_star_graph& a) { *this=a; }
    ~A_star_graph(){}
    A_star_graph* operator = (const A_star_graph *a) { *this=*a; return this; }
    //A_star_graph* operator = (const A_star_graph &a) { ...copy... return this; }

    // interface
    void reset();                                   // clear all
    void ld(Graphics::TBitmap *bmp,DWORD col_wall); // create graph from bitmap col_wall is 0x00RRGGBB
    void draw(Graphics::TBitmap *bmp);              // draw map to bitmap for debugging
    void compute(int p0,int p1);                    // compute path from pnt[p0] to pnt[p1] into path[]
};
//---------------------------------------------------------------------------
A_star_graph::A_star_graph()
{   //BBGGRR
    col_space=0x00FFFFFF;
    col_wall =0x00000000;
    col_path =0x00FFAA40;
    reset();
}
//---------------------------------------------------------------------------
void A_star_graph::reset()
{
    int x,y; xs=0; ys=0; pnt.num=0; path.num=0;
    for (x=0;x<mxy.num;x++) mxy[x].num=0; mxy.num=0;
    for (y=0;y<myx.num;y++) myx[y].num=0; myx.num=0;
}
//---------------------------------------------------------------------------
void A_star_graph::ld(Graphics::TBitmap *bmp,DWORD col_wall)
{
    _pnt p,*pp,*qq;
    int i,j,x,y,c[10]={0,0,0,0,0,0,0,0,0,0};
    DWORD *p0,*p1,*p2;
    reset();
    xs=bmp->Width;
    ys=bmp->Height;
    mxy.allocate(xs); mxy.num=xs; for (x=0;x<xs;x++) mxy[x].num=0;
    myx.allocate(ys); myx.num=ys; for (y=0;y<ys;y++) myx[y].num=0;
    if (!ys) return;
    p.pN=-1; p.pS=-1; p.pE=-1; p.pW=-1; p.mx=-1; p.my=-1;
    p0=NULL; p1=NULL; p2=(DWORD*)bmp->ScanLine[0];
    for (p.y=0;p.y<ys;p.y++)
    {
        p0=p1; p1=p2; p2=NULL;
        if (p.y+1<ys) p2=(DWORD*)bmp->ScanLine[p.y+1];
        for (p.x=0;p.x<xs;p.x++)
            if ((p1[p.x]&0x00FFFFFF)!=col_wall) // ignore wall pixels
            {
                // init connection info
                p.lN=0; p.lS=0; p.lE=0; p.lW=0;
                // init c[] array with not a wall predicates for 4-neighbors
                c[2]=0; c[4]=0; c[5]=1; c[6]=0; c[8]=0;
                if (p0) if ((p0[p.x]&0x00FFFFFF)!=col_wall) c[8]=1;
                if (p2) if ((p2[p.x]&0x00FFFFFF)!=col_wall) c[2]=1;
                if (p1)
                {
                    if (p.x-1> 0) if ((p1[p.x-1]&0x00FFFFFF)!=col_wall) c[4]=1;
                    if (p.x+1<xs) if ((p1[p.x+1]&0x00FFFFFF)!=col_wall) c[6]=1;
                }
                // detect vertex and its connection
                i=0;
                if (( c[2])&&(!c[8])){ i=1; p.lS=1; } // L
                if ((!c[2])&&( c[8])){ i=1; p.lN=1; }
                if (( c[4])&&(!c[6])){ i=1; p.lW=1; }
                if ((!c[4])&&( c[6])){ i=1; p.lE=1; }
                j=c[2]+c[4]+c[6]+c[8];
                if (j==3) // T
                {
                    i=1; p.lN=1; p.lS=1; p.lW=1; p.lE=1;
                    if (!c[2]) p.lS=0;
                    if (!c[8]) p.lN=0;
                    if (!c[6]) p.lE=0;
                    if (!c[4]) p.lW=0;
                }
                if (j==4) // +
                {
                    i=1; p.lN=1; p.lS=1; p.lW=1; p.lE=1;
                }
                // add point
                if (i)
                {
                    p.mx=myx[p.y].num;
                    p.my=mxy[p.x].num;
                    mxy[p.x].add(pnt.num);
                    myx[p.y].add(pnt.num);
                    pnt.add(p);
                }
            }
    }
    // find connection between points
    for (pp=pnt.dat,i=0;i<pnt.num;i++,pp++)
    {
        if (pp->lE)
        {
            j=myx[pp->y][pp->mx+1]; qq=pnt.dat+j; pp->pE=j; qq->pW=i;
            j=abs(qq->x-pp->x)+abs(qq->y-pp->y); pp->lE=j; qq->lW=j;
        }
        if (pp->lS)
        {
            j=mxy[pp->x][pp->my+1]; qq=pnt.dat+j; pp->pS=j; qq->pN=i;
            j=abs(qq->x-pp->x)+abs(qq->y-pp->y); pp->lS=j; qq->lN=j;
        }
    }
}
//---------------------------------------------------------------------------
void A_star_graph::draw(Graphics::TBitmap *bmp)
{
    int i;
    _pnt *p0,*p1;
    // init
    bmp->SetSize(xs,ys);
    // clear (walls)
    bmp->Canvas->Pen->Color=TColor(col_wall);
    bmp->Canvas->Brush->Color=TColor(col_wall);
    bmp->Canvas->FillRect(TRect(0,0,xs,ys));
    // space
    bmp->Canvas->Pen->Color=TColor(col_space);
    for (p0=pnt.dat,i=0;i<pnt.num;i++,p0++)
    {
        if (p0->pN>=0){ p1=pnt.dat+p0->pN; bmp->Canvas->MoveTo(p0->x,p0->y); bmp->Canvas->LineTo(p1->x,p1->y); }
        if (p0->pS>=0){ p1=pnt.dat+p0->pS; bmp->Canvas->MoveTo(p0->x,p0->y); bmp->Canvas->LineTo(p1->x,p1->y); }
        if (p0->pE>=0){ p1=pnt.dat+p0->pE; bmp->Canvas->MoveTo(p0->x,p0->y); bmp->Canvas->LineTo(p1->x,p1->y); }
        if (p0->pW>=0){ p1=pnt.dat+p0->pW; bmp->Canvas->MoveTo(p0->x,p0->y); bmp->Canvas->LineTo(p1->x,p1->y); }
    }
    // found path
    bmp->Canvas->Pen->Color=TColor(col_path);
    for (i=0;i<path.num;i++)
    {
        p0=pnt.dat+path.dat[i];
        if (!i) bmp->Canvas->MoveTo(p0->x,p0->y);
        else    bmp->Canvas->LineTo(p0->x,p0->y);
    }
}
//---------------------------------------------------------------------------
void A_star_graph::compute(int p0,int p1)
{
    _pnt *pp,*qq;
    int i,a,e;
    List<int> upd;  // list of vertexes to update
    // init
    path.num=0;
    if ((p0<0)||(p0>=pnt.num)) return;
    if ((p1<0)||(p1>=pnt.num)) return;
    // clear with max value
    for (pp=pnt.dat,i=0;i<pnt.num;i++,pp++) pp->a=0x7FFFFFFF;
    // init A* to fill from p1
    upd.allocate(xs+ys); upd.num=0;         // preallocate
    upd.add(p1); pnt[p1].a=0;               // start from p1
    // iterative A* filling
    for (e=1;(e)&&(upd.num);)               // loop until hit the start p0 or dead end
    {
        // process/remove last pnt in queue
        pp=pnt.dat+upd[upd.num-1]; upd.num--;
        // link exist? compute cost if less update it reached p0?
        i=pp->pN; if (i>=0){ qq=pnt.dat+i; a=pp->a+pp->lN; if (qq->a>a) { qq->a=a; upd.add(i); } if (i==p0) { e=0; break; }}
        i=pp->pS; if (i>=0){ qq=pnt.dat+i; a=pp->a+pp->lS; if (qq->a>a) { qq->a=a; upd.add(i); } if (i==p0) { e=0; break; }}
        i=pp->pE; if (i>=0){ qq=pnt.dat+i; a=pp->a+pp->lE; if (qq->a>a) { qq->a=a; upd.add(i); } if (i==p0) { e=0; break; }}
        i=pp->pW; if (i>=0){ qq=pnt.dat+i; a=pp->a+pp->lW; if (qq->a>a) { qq->a=a; upd.add(i); } if (i==p0) { e=0; break; }}
    }
    // reconstruct path
    e=p0; pp=pnt.dat+e; path.add(e);
    for (;e!=p1;)                           // loop until path complete
    {
        a=0x7FFFFFFF; e=-1;
        // e = select link with smallest cost
        i=pp->pN; if (i>=0){ qq=pnt.dat+i; if (qq->a<a) { a=qq->a; e=i; }}
        i=pp->pS; if (i>=0){ qq=pnt.dat+i; if (qq->a<a) { a=qq->a; e=i; }}
        i=pp->pE; if (i>=0){ qq=pnt.dat+i; if (qq->a<a) { a=qq->a; e=i; }}
        i=pp->pW; if (i>=0){ qq=pnt.dat+i; if (qq->a<a) { a=qq->a; e=i; }}
        if (e<0) break;                     // dead end
        pp=pnt.dat+e; path.add(e);
    }
}
//---------------------------------------------------------------------------
#endif
//---------------------------------------------------------------------------

Usage:


Graphics::TBitmap *maze=new Graphics::TBitmap;
maze->LoadFromFile("maze.bmp");
maze->HandleType=bmDIB;
maze->PixelFormat=pf32bit;
A_star_graph map;
map.ld(maze,0);
map.compute(0,map.pnt.num-1);
map.draw(maze);

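The code is VCL based (it uses the bitmap described in the second link), and I also use my own dynamic list template, so:

List<double> xxx; is the same as double xxx[];
xxx.add(5); adds 5 to the end of the list
xxx[7] accesses an array element (safe)
xxx.dat[7] accesses an array element (unsafe but fast direct access)
xxx.num is the actual used size of the array
xxx.reset() clears the array and sets xxx.num=0
xxx.allocate(100) preallocates space for 100 items

Sorry, I am not a Python coder, but I think the code is straightforward enough... so I hope you will be able to adapt it to your environment.

Here is the output:

[Image: result]

It looks like it mimics yours... The code is still not optimized and can be improved further. I think you should take a closer look at the mx,my and mxy[][],myx[][] variables. These are the topologically index-sorted vertices that enable the large speedup over your code...

[Edit1]

I updated the A* search code (thanks to Matt Timmermans), so here are the updated results:

[Images: small maze result, big maze result]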