如何对不相交的集合进行分割操作？

Question

我正在研究一个无向图，它支持：

```
addPerson
```
：插入节点
```
person
```
。
```
addRelation
```
：在
```
id1
```
和
```
id2
```
之间添加一条边。
```
modifyRelation
```
：修改/删除
```
id1
```
和
```
id2
```
之间的边缘。
```
isCircle
```
：检查
```
id1
```
和
```
id2
```
是否已连接。
```
queryBlockSum
```
：查询可以分为多少个不相交的集合。

public class MyNetwork implements Network {
    /* ... */
    private final DisjointSet disjointSet;

    @Override
    public void addPerson(Person person) {
        /* ... */
        disjointSet.add(person.getId());
    }

    @Override
    public void addRelation(int id1, int id2, int value) {
        /* ... */
        disjointSet.merge(id1, id2);
    }

    @Override
    public void modifyRelation(int id1, int id2, int value) {
        /* ... */
        int parent = disjointSet.find(id1);
        if (/* I should remove the edge */) {
            HashSet<Integer> set1 = /* all nodes connected with id1 */;
            HashSet<Integer> set2 = /* all nodes connected with id2 */;
            if (!set1.contains(parent)) {
                disjointSet.resetParent(set1, parent, id1);
            }
            if (!set2.contains(parent)) {
                disjointSet.resetParent(set2, parent, id2);
            }
        }
    }

    @Override
    public boolean isCircle(int id1, int id2) {
        return disjointSet.find(id1) == disjointSet.find(id2);
    }

    @Override
    public int queryBlockSum() {
        return disjointSet.getSetCount();
    }
}

为了提高效率，我实现了不相交集数据结构。它支持：

```
add
```
：添加元素
```
x
```
。
```
find
```
：查找元素
```
x
```
的父元素（带路径压缩）。
```
merge
```
：合并两组
```
x
```
和
```
y
```
。
```
split
```
：将
```
c
```
中所有元素的父元素从
```
oldParent
```
更改为
```
newParent
```
，将一组分成两部分。（可以保证对于
```
x
```
中的任何元素
```
c
```
，
```
find(x) == oldParent
```
）
```
getSetCount
```
：获取组数。

public class DisjointSet {
    private final HashMap<Integer, Integer> parent;
    private int setCount;

    public DisjointSet() {
        this.parent = new HashMap<>();
        this.setCount = 0;
    }

    public int getSetCount() {
        return setCount;
    }

    public void add(int x) {
        parent.put(x, x);
        setCount++;
    }

    public int find(int x) {
        int parentX = parent.get(x);
        if (parentX == x) {
            return x;
        }
        parentX = find(parentX);
        parent.put(x, parentX);
        return parentX;
    }

    public boolean merge(int x, int y) {
        int parentX = find(x);
        int parentY = find(y);
        if (parentX != parentY) {
            parent.put(parentX, parentY);
            setCount--;
            return true;
        }
        return false;
    }

    public void split(Collection<Integer> c, int oldParent, int newParent) {
        for (Integer x : c) {
            parent.put(x, newParent);
        }
        setCount++;
    }
}

问题是，多次合并和拆分后，不相交的集合变得有问题：我发现两个元素不在同一个集合中（即具有不同的父元素），而它们应该是在同一个集合中。我尝试在分割之前更新

中所有元素的父级，但它不起作用：

    public void split(Collection<Integer> c, int oldParent, int newParent) {
        // This doesn't fix the bug.
        for (Integer x : c) {
            find(x);
        }
        for (Integer x : c) {
            parent.put(x, newParent);
        }
        setCount++;
    }

但是，如果我在拆分操作之前更新所有元素的父级，则错误得到修复：

    public void split(Collection<Integer> c, int oldParent, int newParent) {
        // This fixes the bug.
        for (Integer x : parent.keySet()) {
            find(x);
        }
        for (Integer x : c) {
            parent.put(x, newParent);
        }
        setCount++;
    }

我不明白为什么我需要更新所有元素的父级，即使它们不需要拆分。

对不相交集执行分割操作的正确方法是什么？或者我是否必须重建不相交集，因为它根本不支持拆分？

Answer 1

假设您有三个值 - 1、2 和 3 - 并且您合并了 1 和 2。这意味着前两个值之一是这两个值中另一个的父级，让我们假设 2 成为1. 如果现在您仅使用该对的 parent 创建一个集合，并调用

split

来移动它以使其与 3 合并，那么您也将 drag 将值 1 拖入该移动中，因为它通过其父链接附加到 2：

        DisjointSet dset = new DisjointSet();
        dset.add(1);
        dset.add(2);
        dset.add(3);
        dset.merge(1, 2);
        Collection<Integer> c = new HashSet<Integer>();
        c.add(2);
        dset.split(c, 2, 3);
        for (int i = 1; i <= 3; i++) {
            System.out.println(i + " is in same set as " + dset.find(i));
        }

该程序将输出：

1 is in same set as 3
2 is in same set as 3
3 is in same set as 3

可视化：

分裂前：

     2   3
     ↑
     1

分裂后：

     2 → 3
     ↑
     1

如您所见，移动节点的后代将随之移动，即使这些后代可能不是

的成员。

不相关的问题

您的合并逻辑不会努力保持所涉及的树平衡。维基百科提到了合并两个集合：

选择哪个节点成为父节点会影响树上未来操作的复杂性。如果不小心，树木会变得过高。

这就是你的情况。您应该按大小或按排名算法实现并集。例如，您可以在 Node 实例中存储和维护集合的大小，如下所述。
```
split
```
方法不需要第二个参数。没用过，也没有用。
```
split
```
方法不应增加
```
setCount
```
，因为此操作仅扩展
```
newParent
```
已属于的集合。它不会创造它。
另一方面，如果分割后留下一个已变空的集合，则
```
split
```
可以减少集合的数量——它拥有的所有节点都被“分割”出来。所以有时应该减少
```
setCount
```
。
可以通过防止插入重复值来改进
```
add
```
方法。
我不会将
```
find
```
设为公共方法，而是提供一个返回所有不相交集的方法。您通常只会在较大的算法结束时调用该方法，以获得一些最终结果。

如何做到

一个想法是用 Node 实例表示值，该实例具有

parent

字段。然后，当您想要执行拆分时，可以保持该 Node 实例不变，但为移动值创建一个新实例，并在 HashMap 中引用 that 一个。较旧的节点保证其子节点仍将连接，但由于它不再在 HashMap 中引用，因此它不再表示值。

这是这个想法的实现：

import java.util.*;

public class DisjointSet {
    private class Node {
        Node parent;
        int size; // To allow for balancing the tree

        Node() {
            parent = this; // Self-reference
            size = 1;
        }
    }

    // Use the concept of nodes
    private final HashMap<Integer, Node> nodes;
    private int setCount;

    public DisjointSet() {
        this.nodes = new HashMap<Integer, Node>();
        this.setCount = 0;
    }

    public int getSetCount() {
        return setCount;
    }

    public void add(int x) {
        if (!nodes.containsKey(x)) { // Don't insert duplicates
            nodes.put(x, new Node());
            setCount++;
        }
    }

    // Useful method to get a final view on the sets
    public Collection<Collection<Integer>> getSets() {
        HashMap<Node, Collection<Integer>> sets = new HashMap<>();
        for (int key : nodes.keySet()) {
            Node node = find(nodes.get(key));
            if (!sets.containsKey(node)) {
                sets.put(node, new ArrayList<Integer>());
            }
            sets.get(node).add(key);
        }
        return sets.values();
    }
    

    // Make this private
    private Node find(Node node) {
        if (node.parent != node) {
            node.parent = find(node.parent);
        }
        return node.parent;
    }

    // Could be a useful method
    public boolean areInSameSet(int x, int y) {
        return find(nodes.get(x)) == find(nodes.get(y));
    }
    
    public boolean merge(int x, int y) {
        return merge(nodes.get(x), nodes.get(y));
    }
    
    private boolean merge(Node nodeX, Node nodeY) {
        if (nodeX == nodeY) { // Already merged
            return false;
        }
        nodeX = find(nodeX);
        nodeY = find(nodeY);
        if (nodeX == nodeY) { // Already merged
            return false;
        }
        // Apply some tree balancing: union by size
        if (nodeX.size < nodeY.size) {
            nodeX.parent = nodeY;
            nodeY.size += nodeX.size;
        } else {
            nodeY.parent = nodeX;
            nodeX.size += nodeY.size;
        }
        setCount--;
        return true;
    }

    public void split(int x, int newParent) {
        split(x, find(nodes.get(newParent)));
    }
    
    private void split(int x, Node target) {
        Node node = nodes.get(x);
        Node root = find(node);
        if (root == target) {
            return; // Already in target
        }
        // If it was the last member in the set...
        if (--root.size == 0) {
            setCount--;
        } 
        // Create a new node for this value
        node = new Node();
        nodes.put(x, node);
        merge(node, target);
    }

    public void split(Collection<Integer> c, int newParent) {
        Node target = find(nodes.get(newParent)); 
        for (Integer x : c) {
            split(x, target);
        }
        // setCount doesn't increase, because newParent was already a set
    }
}

你可以这样使用它：

        DisjointSet dset = new DisjointSet();
        dset.add(1);
        dset.add(2);
        dset.add(3);
        dset.merge(1, 2);
        Collection<Integer> c = new HashSet<Integer>();
        c.add(2);
        dset.split(c, 3);
        for (Collection<Integer> values : dset.getSets()) {
            System.out.println("set: " + values);
        }

将输出：

set [1]
set [2, 3]

如何对不相交的集合进行分割操作？

问题描述投票：0回答：1

1个回答

不相关的问题

如何做到

最新问题

如何对不相交的集合进行分割操作？

问题描述 投票：0回答：1

1个回答

不相关的问题

如何做到

最新问题

问题描述投票：0回答：1