Algorithm: Lexicographically smallest path in an N*M grid

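I came across this question in a recent interview.
Given an N*M grid of numbers, a path through the grid is the sequence of cells you traverse. The constraint is that we can only move right or down in the grid. Given such a grid, we need to find the path from the top-left corner to the bottom-right corner that is lexicographically smallest after sorting.
For example, if the grid is 2*2

4 3

5 1
then the lexicographically smallest path according to the question is "1 3 4".
How should one approach a problem like this? Code is appreciated. Thanks in advance.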

Related discussion

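You can solve this with dynamic programming. Let f(i, j) be the lexicographically smallest path (after sorting the path) from (i, j) to (N, M), moving only right and down. Consider the following recurrence:

f(i, j) = sort( a(i, j) + smallest(f(i + 1, j), f(i, j + 1)))

where a(i, j) is the value in the grid at (i, j), smallest(x, y) returns the lexicographically smaller of the strings x and y, + concatenates two strings, and sort(str) sorts the string str in lexical order.

The base case of the recurrence is:

f(N, M) = a(N, M)

The recurrence also changes when i = N or j = M, because only one of the two neighbours exists there (make sure you see that).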
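As a quick sanity check, the recurrence can be traced by hand on the 2*2 grid from the question (rows "4 3" and "5 1", 1-indexed, so N = M = 2). This is only a worked illustration of the formulas above:

```latex
\begin{aligned}
f(2,2) &= a(2,2) = \text{"1"} \\
f(1,2) &= \operatorname{sort}\bigl(a(1,2) + f(2,2)\bigr) = \operatorname{sort}(\text{"31"}) = \text{"13"} \\
f(2,1) &= \operatorname{sort}\bigl(a(2,1) + f(2,2)\bigr) = \operatorname{sort}(\text{"51"}) = \text{"15"} \\
f(1,1) &= \operatorname{sort}\bigl(a(1,1) + \operatorname{smallest}(f(2,1), f(1,2))\bigr)
        = \operatorname{sort}(\text{"413"}) = \text{"134"}
\end{aligned}
```

The result "134" matches the expected answer "1 3 4".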

Consider the following code written in C++:

#include <algorithm>
#include <string>
using namespace std;

//-- the 200 is just the array size; it can be modified
string a[200][200];        //-- the input grid (1-indexed)
string f[200][200];        //-- the array used for memoization
bool calculated[200][200]; //-- false if we have not calculated the value before, true if we have
int N = 199, M = 199;      //-- number of rows, number of columns

//-- sort the characters of str and return the result
//-- (sorting is per character, so this works as intended for single-digit values)
string srt(string str){
    sort(str.begin(), str.end());
    return str;
}

//-- return the lexicographically smaller of x and y (both have the same length here)
string smallest(const string &x, const string &y){
    return y < x ? y : x;
}

string solve(int i, int j){
    if (i == N && j == M) return a[i][j]; //-- we have reached the bottom-right cell (the array is assumed to be 1-indexed)
    if (calculated[i][j]) return f[i][j]; //-- we have calculated this before
    string ans;
    if (i == N) ans = srt(a[i][j] + solve(i, j + 1));      //-- we are at the bottom boundary
    else if (j == M) ans = srt(a[i][j] + solve(i + 1, j)); //-- we are at the right boundary
    else ans = srt(a[i][j] + smallest(solve(i, j + 1), solve(i + 1, j)));
    calculated[i][j] = true; //-- so that future calls fetch the stored result
    f[i][j] = ans;
    return ans;
}

string calculateSmallestPath(){
    return solve(1, 1);
}
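Because the code above sorts the characters of a string, it only behaves as intended when every cell holds a single digit. The following is a minimal sketch of the same recurrence on integer vectors, filled bottom-up instead of by recursion. The function name smallestSortedPath and the 0-indexed std::vector grid are assumptions made for this illustration, not part of the original answer.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns the lexicographically smallest sorted path from the top-left to the
// bottom-right cell, moving only right or down.
// Assumes a non-empty rectangular grid (0-indexed).
std::vector<int> smallestSortedPath(const std::vector<std::vector<int>>& grid) {
    const int n = static_cast<int>(grid.size());
    const int m = static_cast<int>(grid[0].size());
    // best[j] holds f(i, j) for the row currently being processed.
    std::vector<std::vector<int>> best(m);
    for (int i = n - 1; i >= 0; --i) {
        for (int j = m - 1; j >= 0; --j) {
            std::vector<int> path;
            const bool hasDown = i + 1 < n;   // best[j] still holds f(i + 1, j)
            const bool hasRight = j + 1 < m;  // best[j + 1] already holds f(i, j + 1)
            if (hasDown && hasRight) path = std::min(best[j], best[j + 1]);
            else if (hasDown) path = best[j];
            else if (hasRight) path = best[j + 1];
            // Inserting into an already sorted vector replaces the full re-sort.
            path.insert(std::upper_bound(path.begin(), path.end(), grid[i][j]), grid[i][j]);
            best[j] = std::move(path);
        }
    }
    return best[0];
}
```

Calling smallestSortedPath on the grid {{4, 3}, {5, 1}} from the question yields the vector 1, 3, 4. std::vector already compares lexicographically with operator<, so std::min plays the role of smallest.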

Related discussion

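If no numbers repeat, this can also be done in O(NM log(NM)).

Intuition:

Suppose I denote the grid with top-left corner (a,b) and bottom-right corner (c,d) as G(a,b,c,d). Since you have to obtain the lexicographically smallest string after sorting the path, the goal should be to find the minimum value in G each time. If that minimum is attained at, say, (i,j), then G(i,b,c,j) and G(a,j,i,d) become useless for the search for our next minimum (for the path). That is, the values of the path we want will never lie in these two grids. Proof? If we pass through any position in these grids, we can never reach the minimum of G(a,b,c,d) (the one at (i,j)). And if we avoid (i,j), the path we build cannot be the lexicographically smallest.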


for each val in sorted key set S do
    (i,j) <- Dict(val)
    Grid G <- Root(T)
    while (i,j) in G do
        if G has no child then
            G.left <- G(a,b,i,j)
            G.right <- G(i,j,c,d)
            break
        else if (i,j) in G.left then
            G <- G.left
        else if (i,j) in G.right then
            G <- G.right
        else
            Dict(val) <- null
            break
        end if
    end while
end for
for each val in G(1,1,m,n) do
    if Dict(val) not null then
        solution.append(val)
    end if
end for
return solution

Java code:

import java.awt.Point;
import java.util.HashMap;
import java.util.SortedSet;
import java.util.TreeSet;

class Grid{
    int a, b, c, d;   // top-left corner (a,b), bottom-right corner (c,d)
    Grid left, right; // sub-grids created when a path cell splits this grid
    Grid(int a, int b, int c, int d){
        this.a = a;
        this.b = b;
        this.c = c;
        this.d = d;
        left = right = null;
    }
    public boolean isInGrid(int e, int f){
        return (e >= a && e <= c && f >= b && f <= d);
    }
    public boolean hasNoChild(){
        return (left == null && right == null);
    }
}

// findPath is meant to be placed inside a class of your choice.
public static int[] findPath(int[][] arr){
    int row = arr.length;
    int col = arr[0].length;
    // Map every (distinct) value to its position in the grid.
    HashMap<Integer,Point> map = new HashMap<Integer,Point>();
    for(int i = 0; i < row; i++){
        for(int j = 0; j < col; j++){
            map.put(arr[i][j], new Point(i,j));
        }
    }
    Grid root = new Grid(0,0,row-1,col-1);
    // Visit the values in increasing order.
    SortedSet<Integer> keys = new TreeSet<Integer>(map.keySet());
    for(Integer entry : keys){
        Grid temp = root;
        int x = map.get(entry).x, y = map.get(entry).y;
        while(temp.isInGrid(x, y)){
            if(temp.hasNoChild()){
                // The cell lies in a grid that is still usable: keep it and split the grid.
                temp.left = new Grid(temp.a,temp.b,x, y);
                temp.right = new Grid(x, y,temp.c,temp.d);
                break;
            }
            if(temp.left.isInGrid(x, y)){
                temp = temp.left;
            }
            else if(temp.right.isInGrid(x, y)){
                temp = temp.right;
            }
            else{
                // The cell lies in a discarded region: mark it as not on the path.
                map.get(entry).x = -1;
                break;
            }
        }
    }
    // Collect the kept cells in row-major order, which is also the order along the path.
    int[] solution = new int[row+col-1];
    int count = 0;
    for(int i = 0 ; i < row; i++){
        for(int j = 0; j < col; j++){
            if(map.get(arr[i][j]).x >= 0){
                solution[count++] = arr[i][j];
            }
        }
    }
    return solution;
}

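The space complexity consists of maintaining the dictionary, O(NM), and the tree, O(N+M). Overall: O(NM).

Time complexity: populating and then sorting the dictionary is O(NM log(NM)); checking each of the NM values against the tree is O(NM log(N+M)). Overall: O(NM log(NM)). Note that the tree bound assumes the tree stays roughly balanced. The tree is never rebalanced, so if it degenerates into a chain, a single lookup can cost up to O(N+M).

Of course, this will not work if values repeat, because then a single value maps to multiple (i,j) positions in the grid, and the choice no longer fits the greedy approach.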
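For comparison, here is a sketch of the same greedy idea that swaps the rectangle tree for a balanced ordered set (std::set), so each compatibility check costs O(log(N+M)) regardless of the input order. This is a substituted technique, not the tree from the answer: a cell is kept only if it is neither to the left of the kept cell before it nor to the right of the kept cell after it in (row, column) order. The name greedySmallestPath is hypothetical, the grid is assumed to be non-empty with all values distinct, and the code needs C++17 for structured bindings.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Greedy sketch for grids whose values are all distinct (assumption).
// Returns the values on the chosen path in ascending order.
std::vector<int> greedySmallestPath(const std::vector<std::vector<int>>& grid) {
    const int n = static_cast<int>(grid.size());
    const int m = static_cast<int>(grid[0].size());
    // (value, row, col) triples, sorted so that cells are visited smallest first.
    std::vector<std::tuple<int, int, int>> cells;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j)
            cells.emplace_back(grid[i][j], i, j);
    std::sort(cells.begin(), cells.end());

    std::set<std::pair<int, int>> chosen;  // kept cells, ordered by (row, col)
    std::vector<int> path;
    for (const auto& [value, row, col] : cells) {
        // First kept cell that is not before (row, col) in (row, col) order.
        auto next = chosen.lower_bound(std::make_pair(row, col));
        // The kept cells form a right/down chain, so checking the two
        // neighbours in that order is enough.
        const bool okNext = next == chosen.end() || next->second >= col;
        const bool okPrev = next == chosen.begin() || std::prev(next)->second <= col;
        if (okNext && okPrev) {
            chosen.emplace_hint(next, row, col);
            path.push_back(value);
        }
    }
    return path;
}
```

On the 2*2 grid from the question, {{4, 3}, {5, 1}}, the cells holding 1, 3 and 4 are kept and the cell holding 5 is rejected because the kept cell before it, (0,1), lies to its right.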

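Also, FYI: a similar problem I heard about earlier had an additional grid property: no values repeat and the numbers run from 1 to NM. In that case the complexity can be reduced further to O(NM log(N+M)), because you can simply use the values in the grid as indices into an array instead of using a dictionary (which needs no sorting).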
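A minimal sketch of that array-as-index idea, assuming the grid holds each value from 1 to NM exactly once. The helper name positionsByValue is made up for this illustration. Walking the returned array from index 1 upward visits the cells in increasing value order without any sorting step.

```cpp
#include <utility>
#include <vector>

// Assumes a non-empty grid that holds each value 1..N*M exactly once.
// positions[v] is the (row, col) of value v; index 0 is unused.
std::vector<std::pair<int, int>> positionsByValue(const std::vector<std::vector<int>>& grid) {
    const int n = static_cast<int>(grid.size());
    const int m = static_cast<int>(grid[0].size());
    std::vector<std::pair<int, int>> positions(n * m + 1, std::make_pair(-1, -1));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j)
            positions[grid[i][j]] = std::make_pair(i, j);
    return positions;
}
```

Building the array takes O(NM) time, so the remaining cost is the per-value check against the tree.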