我正在尝试使用 DFS 和 BFS 来查找长度达到给定 k 的所有简单路径,从有向图中的给定顶点开始。不允许循环。
我的代码如下,我已经验证这两种算法都能产生正确的结果。
我定义了图类,并将BFS和DFS的代码放在这个类中。我需要的图中大约有 190,000 个顶点和 2,500,000 个边。每个顶点的平均出度为105:
public class AdjGraph {
private int V; // sum of vertices
private int E; // sum of edges
private List<Vertex> vertexList;
private Map<Vertex, List<Edge>> vertexAdj;
public AdjGraph() {
this.V = 0;
this.E = 0;
this.vertexList = new ArrayList<>();
this.vertexAdj = new HashMap<>();
}
public void addVertex(Vertex v) {
this.vertexList.add(v);
this.vertexAdj.put(v, new ArrayList<>());
this.V++;
}
public void addEdge(Edge e) {
Vertex startVertex = e.getStartVertex();
List<Edge> edgeList = this.vertexAdj.get(startVertex);
if (!edgeList.contains(e)) {
edgeList.add(e);
this.vertexAdj.put(startVertex, edgeList);
}
}
public List<Vertex> getVertexList() {
return vertexList;
}
// Version of DFS
public Map<Integer, List<List<Vertex>>> findAllPathsUpToLengthByDFS(Vertex start, int k) {
boolean[] visited = new boolean[vertexList.size()];
List<Vertex> path = new ArrayList<>();
Map<Integer, List<List<Vertex>>> allPaths = new HashMap<>();
findAllPathsUpToLengthByDFSUtil(start, k, visited, path, allPaths);
return allPaths;
}
// Used by the function findAllPathsUpToLengthByDFS(...)
private void findAllPathsUpToLengthByDFSUtil(Vertex u, int k, boolean[] visited, List<Vertex> path, Map<Integer, List<List<Vertex>>> allPaths) {
// Mark the current node as visited and store in path
visited[vertexList.indexOf(u)] = true;
path.add(u);
// If current path length exceeds k, backtrack
if (path.size() - 1 > k) {
visited[vertexList.indexOf(u)] = false;
path.remove(path.size() - 1);
return;
}
// If current path length is within k, add it to allPaths
int pathLength = path.size() - 1;
if (pathLength <= k) {
allPaths.computeIfAbsent(pathLength, x -> new ArrayList<>()).add(new ArrayList<>(path));
}
// Recur for all the vertices adjacent to this vertex
for (Edge edge : vertexAdj.get(u)) {
Vertex v = edge.getEndVertex();
if (!visited[vertexList.indexOf(v)]) {
findAllPathsUpToLengthByDFSUtil(v, k, visited, path, allPaths);
}
}
// Remove current vertex from path and mark it as unvisited
path.remove(path.size() - 1);
visited[vertexList.indexOf(u)] = false;
}
// Version of BFS
public Map<Integer, List<List<Vertex>>> findAllPathsUpToLengthByBFS(Vertex start, int k) {
Map<Integer, List<List<Vertex>>> allPaths = new HashMap<>();
Queue<List<Vertex>> queue = new LinkedList<>();
// Initialize the queue with the start vertex
List<Vertex> startPath = new ArrayList<>();
startPath.add(start);
queue.add(startPath);
while (!queue.isEmpty()) {
List<Vertex> path = queue.poll();
Vertex last = path.get(path.size() - 1);
// Add the path to allPaths if its length is within k
int pathLength = path.size() - 1;
if (pathLength <= k) {
allPaths.computeIfAbsent(pathLength, x -> new ArrayList<>()).add(new ArrayList<>(path));
}
// If the path length is already k, no need to explore further
if (pathLength < k) {
// Explore the neighbors of the last vertex in the path
for (Edge edge : vertexAdj.get(last)) {
Vertex next = edge.getEndVertex();
if (!path.contains(next)) { // Avoid cycles
List<Vertex> newPath = new ArrayList<>(path);
newPath.add(next);
queue.add(newPath);
}
}
}
}
return allPaths;
}
}
顶点类:
public class Vertex {
// The origin names are not id1 and id2.
// I change them for simplicity.
private String id1;
private String id2;
public Vertex() {
}
public Vertex(String id1, String id2) {
this.id1 = id1;
this.id2 = id2;
}
// Getters and Setters
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null || getClass() != obj.getClass()) {
return false;
}
Vertex other = (Vertex) obj;
return id1.equals(other.id1) && id2.equals(other.id2);
}
@Override
public int hashCode() {
return Objects.hash(id1, id2);
}
}
边缘类:
public class Edge {
private Vertex startVertex;
private Vertex endVertex;
public Edge(Vertex startVertex, Vertex endVertex) {
this.startVertex = startVertex;
this.endVertex = endVertex;
}
// Getters for startVertex and endVertex
public Vertex getStartVertex() {
return startVertex;
}
public Vertex getEndVertex() {
return endVertex;
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null || getClass() != obj.getClass()) {
return false;
}
Edge other = (Edge) obj;
return Objects.equals(startVertex, other.startVertex) &&
Objects.equals(endVertex, other.endVertex);
}
@Override
public int hashCode() {
return Objects.hash(startVertex, endVertex);
}
}
这是一个例子。本示例的数据量较小,因此运行时间差异并不明显。当图形很大时,差异就很明显。在我的图中,当 k = 1 时,BFS 运行 2 毫秒,而 DFS 运行 15 秒:
public static void main(String[] args) {
AdjGraph graph = new AdjGraph();
for (int i = 0; i < 1000; i++) {
Vertex v = new Vertex(String.valueOf(i), String.valueOf(i));
graph.addVertex(v);
}
List<Vertex> vertexList = graph.getVertexList();
for (int i = 0; i < vertexList.size(); i++) {
for (int j = 0; j < vertexList.size(); j++) {
if (j != i) {
graph.addEdge(new Edge(vertexList.get(i), vertexList.get(j)));
}
}
}
Vertex start = vertexList.get(0); // Start node
int k = 1; // Maximum path length
Long t1 = System.currentTimeMillis();
Map<Integer, List<List<Vertex>>> allPaths = graph.findAllPathsUpToLengthByBFS(start, k);
// Map<Integer, List<List<Vertex>>> allPaths = graph.findAllPathsUpToLengthByBFS(start, k);
Long t2 = System.currentTimeMillis();
System.out.println("Running time: " + (t2 - t1) + "ms");
}
令人惊讶的是,BFS 比 DFS 快得多。有人知道为什么吗?也许是因为DFS重复计算太多了?我该如何优化它们?
谢谢!
主要原因是,在 DFS 实现中,当您已经达到最大路径长度时,您仍然会寻找邻居。换句话说,您的基本案例出口比您所能做的更深一层,这意味着进行了大量迭代,而没有任何可能添加新路径。
可以通过更改此部分来解决此问题:
for (Edge edge : vertexAdj.get(u)) {
Vertex v = edge.getEndVertex();
if (!visited[vertexList.indexOf(v)]) {
findAllPathsUpToLengthByDFSUtil(v, k, visited, path, allPaths);
}
}
对此:
if (pathLength < k) { // *********
for (Edge edge : vertexAdj.get(u)) {
Vertex v = edge.getEndVertex();
if (!visited[vertexList.indexOf(v)]) {
findAllPathsUpToLengthByDFSUtil(v, k, visited, path, allPaths);
}
}
}
注意这正是您在 BFS 实现中使用的模式。 通过此更改,您可以删除您拥有的其他基本案例处理:
// If current path length exceeds k, backtrack
if (path.size() - 1 > k) {
visited[vertexList.indexOf(u)] = false;
path.remove(path.size() - 1);
return;
}
但是,您可以采取更多措施来改进 DFS 实施,尽管这些措施的效果不太显着:
visited[vertexList.indexOf(v)]
是一个相对较慢的操作。使用哈希集而不是数组会更有效。path
制作为有序哈希集 (LinkedHashSet
),这样您就可以删除 visited
并使用 path.contains
来实现此目的。这是实现这些更改的 DFS 实现版本。我在函数的末尾添加了“2”,以便您可以将它们包含在您的项目中并将其性能与原始函数进行比较:
// Improved Version of DFS (DFS2)
public Map<Integer, List<List<Vertex>>> findAllPathsUpToLengthByDFS2(Vertex start, int k) {
// Use an insertion-ordered set so you don't need a separate visited array
Set<Vertex> path = new LinkedHashSet<>();
Map<Integer, List<List<Vertex>>> allPaths = new HashMap<>();
findAllPathsUpToLengthByDFSUtil2(start, k, path, allPaths);
return allPaths;
}
private void findAllPathsUpToLengthByDFSUtil2(Vertex u, int k, Set<Vertex> path, Map<Integer, List<List<Vertex>>> allPaths) {
path.add(u);
int pathLength = path.size() - 1;
allPaths.computeIfAbsent(pathLength, x -> new ArrayList<>()).add(new ArrayList<>(path));
if (pathLength < k) { // This is where you should stop recursion
for (Edge edge : vertexAdj.get(u)) {
Vertex v = edge.getEndVertex();
if (!path.contains(v)) {
findAllPathsUpToLengthByDFSUtil2(v, k, path, allPaths);
}
}
}
path.remove(u);
}