Efficiently finding an approximate minimum spanning tree of a large graph

Question

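I have a large number of nodes, about 25,000, each with a 3D position.

I want to generate a sparsely connected graph, with edges weighted by the distance between nodes, for use in a GNN.

Algorithms for finding the minimum spanning tree (MST) usually start from the fully connected graph and then remove edges to find the MST. This is very slow for large graphs.

To speed this up, I tried using scipy's sparse distance matrix to limit the radius of the initial connections to the maximum nearest-neighbour distance, but this results in multiple connected components for some graphs.

(Example of a smaller graph where this method does not work:)

Here is what I tried: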

import networkx as nx
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial import KDTree
from scipy.spatial import distance_matrix

# Create a random set of 25000 node positions
positions = np.random.normal(loc=500000, scale=10000, size=(25000, 3))

# Method 1: Full MST Search (Too Slow)
dist_matrix = distance_matrix(positions, positions)
mst = minimum_spanning_tree(dist_matrix)


# Method 2: Start from sparse matrix (Results in multiple connected components)
kdtree = KDTree(positions)

# Find the maximum nearest neighbour distance for the graph
distances, _ = kdtree.query(positions, k=2)
max_neighbour_distance = np.max(distances[:, 1])
max_neighbour_distance = np.ceil(max_neighbour_distance) # Round up to avoid floating point errors in MST search


# Threshold the distance matrix by this distance
sparse_dist_matrix = kdtree.sparse_distance_matrix(kdtree, max_neighbour_distance, output_type="coo_matrix")

mst = minimum_spanning_tree(sparse_dist_matrix)

G = nx.from_scipy_sparse_array(mst)

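I don't need the true minimum spanning tree; I just need the graph to be connected with as few edges as possible to speed up the GNN. For some graphs, even the sparse method is too slow.

I considered an approach based on https://www.cs.umd.edu/~mount/Papers/soda16-emst.pdf, but it looks difficult to implement; for example, scipy has no quadtree.

Converting the full distance matrix to a networkx graph and then using an implementation of Boruvka's algorithm is even slower; it does not scale to large graphs. Adding a multiplier to max_neighbour_distance helps ensure there is only one connected component, but it also increases processing time and is not always sufficient.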

Answer 1

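The Euclidean minimum spanning tree of a set of points is a subgraph of their Delaunay triangulation, which has a linear number of edges. (Strictly, the linear bound holds in 2D; in 3D the worst case is quadratic, but for well-distributed points the edge count is typically close to linear.)

Scipy has a method for computing the Delaunay triangulation efficiently: scipy.spatial.Delaunay.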
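
As a rough sketch of how this could look in practice (not necessarily what the answer's author had in mind): build the Delaunay triangulation, collect the unique edges of its simplices, weight them by Euclidean length, and run minimum_spanning_tree on that sparse graph instead of the full distance matrix. The helper name delaunay_mst is made up for illustration, and the sketch assumes the points contain no exact duplicates and are not all coplanar.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial import Delaunay


def delaunay_mst(positions):
    """Euclidean MST of `positions`, computed on Delaunay edges only.

    Hypothetical helper for illustration. Assumes no duplicate points
    (zero-length edges would be dropped by the sparse format).
    """
    n = len(positions)
    tri = Delaunay(positions)

    # Each simplex (a tetrahedron in 3D) contributes every pair of its
    # vertices as a candidate edge.
    simplices = tri.simplices
    k = simplices.shape[1]
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    edges = np.concatenate([simplices[:, [i, j]] for i, j in pairs])

    # Order each pair as (low, high) and drop edges shared between simplices.
    edges = np.unique(np.sort(edges, axis=1), axis=0)

    # Weight every edge by its Euclidean length.
    lengths = np.linalg.norm(
        positions[edges[:, 0]] - positions[edges[:, 1]], axis=1
    )

    # Upper-triangular sparse graph; minimum_spanning_tree treats it as undirected.
    graph = coo_matrix((lengths, (edges[:, 0], edges[:, 1])), shape=(n, n))
    return minimum_spanning_tree(graph)
```

The result can be passed to nx.from_scipy_sparse_array in the same way as in the question. Because the Delaunay triangulation of points in general position is connected, the resulting tree should span all nodes without any distance threshold to tune; connected_components(mst, directed=False) can be used to confirm this.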
