BFS求最短路径的两种实现方法，哪一种是明显的赢家？

Question

有两种实现 BFS 来查找两个节点之间的最短路径的方法。第一种是使用列表的列表来表示路径队列。另一种是维护每个节点到其父节点的映射，在检查相邻节点时，记录其父节点，最后根据父节点映射进行回溯以找到路径。（有关更多详细信息，请参阅这篇文章。https://stackoverflow.com/a/8922151/13886907。感谢乔对此问题的回答和代码！）

将它们复制到此处：第一种方式：

def bfs(graph, start, end):
    # maintain a queue of paths
    queue = []
    # push the first path into the queue
    queue.append([start])
    while queue:
        # get the first path from the queue
        path = queue.pop(0)
        # get the last node from the path
        node = path[-1]
        # path found
        if node == end:
            return path
        # enumerate all adjacent nodes, construct a 
        # new path and push it into the queue
        for adjacent in graph.get(node, []):
            new_path = list(path)
            new_path.append(adjacent)
            queue.append(new_path)

print(bfs(graph, 'A', 'F'))

第二种方式：

def backtrace(parent, start, end):
    path = [end]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path
        

def bfs(graph, start, end):
    parent = {}
    queue = []
    queue.append(start)
    while queue:
        node = queue.pop(0)
        if node == end:
            return backtrace(parent, start, end)
        for adjacent in graph.get(node, []):
            if node not in queue :
                parent[adjacent] = node # <<<<< record its parent 
                queue.append(adjacent)

print(bfs(graph, 'A', 'F'))

和图（有向图）

graph = {'A': ['C', 'D', 'B'],
        'B': ['C', 'E'],
        'C': ['E'],
        'D': ['F'],
        'E': ['F']}

我们可以看到，第二种方式可以节省内存成本，因为队列不需要存储路径，并且队列和父图的空间复杂度都是 O(V)，其中 V 是顶点数。而且，最终的回溯过程最多花费额外的 O(V) 时间。

那么，在寻找有向图中两个节点之间的最短路径或所有路径方面，第二种方式是否在各方面都优于第一种方式呢？我们是否可以将第二种视为对 BFS 基础版本（第一种方式）的优化？

Answer 1

第二个版本更好。这是因为内存分配也需要时间。即这一行：

new_path = list(path)

...就

path

的长度而言，时间复杂度为 O(k)。即使在“最好”的情况下，图实际上只是从源节点到目标节点的单个路径，第一个代码也将花费 O(1) + O(2) + O(3) + ... + O(n ）执行此 list(path) 调用，其时间复杂度为 O(n²)。在这种“快乐路径”情况下，第二个版本的复杂度为 O(n)。当图中的分支因子变大时，事情只会变得更糟。

备注您的代码

两个代码片段都有问题：

似乎
有这样的保护，但还不够好。它检查下一个节点是否已在队列中。但即使不是，以前也可能是，而且在这种情况下，也不应该重新审视它。我们可以使用 parent 来知道该节点是否已经被访问过。

def bfs_1(graph, start, end): queue = [] visited = set() queue.append([start]) visited.add(start) while queue: path = queue.pop(0) node = path[-1] if node == end: return path for adjacent in graph.get(node, []): if adjacent not in visited: visited.add(adjacent) new_path = list(path) new_path.append(adjacent) queue.append(new_path)

还有

def backtrace(parent, start, end): path = [end] while path[-1] != start: path.append(parent[path[-1]]) path.reverse() return path def bfs_2(graph, start, end): parent = {} queue = [] queue.append(start) parent[start] = None while queue: node = queue.pop(0) if node == end: return backtrace(parent, start, end) for adjacent in graph.get(node, []): if adjacent not in parent: parent[adjacent] = node # <<<<< record its parent queue.append(adjacent)

比较

我使用以下测试代码来测试上述算法：

import random from timeit import timeit def create_graph(size): graph = {} nodes = list(range(size)) for i in nodes: graph[i] = set(random.choices(nodes, k=3)) if i in graph[i]: graph[i].remove(i) graph[i] = list(graph[i]) return graph graph = create_graph(40000) print("version 1") print(bfs_1(graph, 1, 2)) print("time used", timeit(lambda: bfs_1(graph, 1, 2), number=10)) print() print("version 2") print(bfs_2(graph, 1, 2)) print("time used", timeit(lambda: bfs_2(graph, 1, 2), number=10))

生成的图有 100 000 个节点，分支因子约为 2。边是随机的。大多数时候，第二种算法比第一种算法更快。当解决方案路径更长时，这种差异变得更加明显。

BFS求最短路径的两种实现方法，哪一种是明显的赢家？

问题描述投票：0回答：1

1个回答

两个代码片段都有问题：

我使用以下测试代码来测试上述算法：

最新问题

BFS求最短路径的两种实现方法，哪一种是明显的赢家？

问题描述 投票：0回答：1

1个回答

两个代码片段都有问题：

我使用以下测试代码来测试上述算法：

最新问题

问题描述投票：0回答：1