Detecting cycles in a graph using Kruskal's algorithm

Question

I'm implementing Kruskal's algorithm, a well-known approach to finding the minimum spanning tree of a weighted graph. However, I am adapting it to find cycles in a graph. This is the pseudocode for Kruskal's algorithm:

KRUSKAL(G):
    A = ∅
    foreach v ∈ G.V:
        MAKE-SET(v)
    foreach (u, v) ordered by weight(u, v), increasing:
        if FIND-SET(u) ≠ FIND-SET(v):
            A = A ∪ {(u, v)}
            UNION(u, v)
    return A

I'm having a hard time grasping the FIND-SET() and MAKE-SET() functions, and how they are implemented with the disjoint-set data structure.

My current code looks like this:

class edge {
    public:      // public for quick access (temporary)
      char leftV;
      char rightV;
      int weight;
};

std::vector<edge> kruskalMST(std::vector<edge> edgeList){
    std::vector<char> set;
    std::vector<edge> result;
    sortList(edgeList);    // sorts by weight (passed by reference)
    std::size_t i = 0;
    while(i < edgeList.size()){
        if(set.empty()){
            set.push_back(edgeList[i].leftV);     // should only push them when
            set.push_back(edgeList[i].rightV);    // they aren't there yet, will fix
            result.push_back(edgeList[i]);
            ++i;
        }
        else {
            // Both endpoints already seen: treated as a cycle (this is the flawed check)
            if((setContains(set , edgeList[i].leftV)) && (setContains(set , edgeList[i].rightV)))
                ++i; // skip edge
            else {
                set.push_back(edgeList[i].leftV);     // should only push them when
                set.push_back(edgeList[i].rightV);    // they aren't there yet, will fix
                result.push_back(edgeList[i]);
                ++i;
            }
        }
    }
    return result;
}

My code detects a cycle in a graph when two vertices that are already present in the set vector appear again. This seemed to work in most cases until I encountered a situation like this:

  [a]              [c]
   |                |
   |                |
   |                |
  [b]              [d]

When these two edges come first in sorted order, a, b, c, and d have all been pushed into the set vector. Joining [a] to [c] then doesn't produce a cycle in the graph, but my current implementation detects it as one.

Is there any viable alternative to detect cycles in my case? Or if someone could explain how MAKE-SET, FIND-SET, and UNION work in Kruskal's algorithm, that would help a lot.

Answer 1

MAKE-SET(v) means that you're initializing a set consisting of only the vertex v. Initially, each vertex is in a set on its own.

FIND-SET(u) is a function that tells you which set a vertex belongs to. It must return a pointer or an ID number that uniquely identifies the set.

UNION(u, v) means that you merge the set containing u with the set containing v. In other words, if u and v are in different sets, the UNION operation will form a new set containing all the members of the sets FIND-SET(u) and FIND-SET(v).

When we implement these operations with the disjoint-set data structure, the key idea is that every set is represented by a leader. Every vertex has a pointer to some vertex in its set. The leader of the set is a vertex that points to itself. All other vertices point to a parent, and the pointers form a tree structure that has the leader as its root.

To implement FIND-SET(u), you follow pointers starting from u until you reach the set leader, which is the only vertex in the set that points to itself.

To implement UNION(u, v), you make the leader of one set point to the leader of the other set.
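The three operations described so far can be sketched roughly as follows. This is only an illustration, not code from the question: the type name NaiveDisjointSet, its member names, and the assumption that vertices are numbered 0 to n - 1 are all made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper type (not from the original post); vertices are 0..n-1.
struct NaiveDisjointSet {
    std::vector<int> parent;  // parent[v] == v means v is the leader of its set

    // MAKE-SET for every vertex: each one starts as its own leader.
    explicit NaiveDisjointSet(int n) : parent(n) {
        std::iota(parent.begin(), parent.end(), 0);
    }

    // FIND-SET: follow parent pointers until reaching a vertex that points to itself.
    int findSet(int v) const {
        assert(v >= 0 && static_cast<std::size_t>(v) < parent.size());
        while (parent[v] != v) v = parent[v];
        return v;
    }

    // UNION: make the leader of v's set point to the leader of u's set.
    // Returns false if u and v were already in the same set.
    bool unite(int u, int v) {
        int lu = findSet(u);
        int lv = findSet(v);
        if (lu == lv) return false;
        parent[lv] = lu;
        return true;
    }
};
```

With a = 0, b = 1, c = 2, d = 3, calling unite(0, 1) and unite(2, 3) leaves two separate sets, so findSet(0) and findSet(2) differ and the edge a-c is correctly seen as not closing a cycle.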

These operations can be optimized with the ideas of union by rank and path compression.

Union by rank means that you keep track of the maximum number of pointers from any vertex in a set to the leader. That is the same as the height of the tree formed by the pointers (once path compression is also used, the rank is only an upper bound on that height). You can update the rank in a constant number of steps for every UNION operation, which is the only time a set's rank can change. Suppose that we are merging sets A and B. If A has a larger rank than B, we make the leader of B point to the leader of A, and the resulting set has the same rank as A. If A has a smaller rank than B, we make the leader of A point to the leader of B, and the resulting set has the same rank as B. If A and B have the same rank, it doesn't matter which leader we choose: whether we make the leader of A point to the leader of B or vice versa, the resulting set will have a rank that is one greater than the rank of A.

Path compression means that when we perform the FIND operation, which entails following a sequence of pointers to the leader of the set, we make all of the vertices we encounter along the way point directly to the leader. This increases the amount of work for the current FIND operation by only a constant factor, and it reduces the amount of work for future invocations of FIND.

If you implement union by rank and path compression, you will have a blazingly fast union-find implementation. The time complexity for n operations is O(n α(n)), where α is the inverse Ackermann function. This function grows so slowly that even if n is the number of atoms in the universe, α(n) is at most 5. Thus, it is practically a constant, and the optimized disjoint-set data structure is practically a linear-time implementation of union-find.
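As a rough sketch of how both optimizations might fit together, assuming vertices are numbered 0 to n - 1 (the type name DisjointSet and its members are invented for this example and are not from the question):

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper type (not from the original post); vertices are 0..n-1.
struct DisjointSet {
    std::vector<int> parent;  // parent[v] == v means v is a leader
    std::vector<int> rank;    // upper bound on the tree height below each leader

    explicit DisjointSet(int n) : parent(n), rank(n, 0) {
        std::iota(parent.begin(), parent.end(), 0);
    }

    // FIND with path compression, done in two passes.
    int findSet(int v) {
        assert(v >= 0 && static_cast<std::size_t>(v) < parent.size());
        int leader = v;
        while (parent[leader] != leader) leader = parent[leader];
        // Second pass: point every vertex on the path directly at the leader.
        while (parent[v] != leader) {
            int next = parent[v];
            parent[v] = leader;
            v = next;
        }
        return leader;
    }

    // UNION by rank: the leader with the smaller rank is attached under the other.
    // Returns false if u and v were already in the same set.
    bool unite(int u, int v) {
        int lu = findSet(u);
        int lv = findSet(v);
        if (lu == lv) return false;
        if (rank[lu] < rank[lv]) std::swap(lu, lv);  // ensure rank[lu] >= rank[lv]
        parent[lv] = lu;
        if (rank[lu] == rank[lv]) ++rank[lu];        // equal ranks: result grows by one
        return true;
    }
};
```

The rank is only touched inside unite, matching the observation that a UNION is the only time a set's rank can change.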

Answer 2

I won't repeat the set-theoretic description of the union/find algorithm (Kruskal's algorithm is just one application of it), but will use a simpler approach, on top of which you can apply union by rank and path compression.

For simplicity I assume that each vertex has a unique integer ID ranging from 0 to order - 1 (so the vertex ID can be used as an index into an array).

The naive algorithm is so simple that the code speaks for itself:

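// Returns the representative (root) of v's component.
// A negative entry in cc marks a root; a non-negative entry is the parent's ID.
int find(int v, int cc[]) {
  while (cc[v] >= 0)
    v = cc[v];
  return v;
}

// Merges the components of v0 and v1.
// Returns false if they were already connected, i.e. the edge would close a cycle.
bool edge_union(int v0, int v1, int cc[]) {
  int r0 = find(v0, cc);
  int r1 = find(v1, cc);
  if (r0 != r1) {
    cc[r1] = r0;  // attach r1's tree under r0
    return true;
  }
  return false;
}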

The cc array is initialized with -1 everywhere (and of course its size equals the graph order, i.e. the number of vertices).

Path compression can then be done by stacking the vertices encountered in the while loop of the find function and then setting the same representative for all of them.

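#include <deque>

// Same as find, but with path compression.
int find2(int v, int cc[]) {
  std::deque<int> stack;  // vertices visited on the way to the root
  while (cc[v] >= 0) {
    stack.push_back(v);
    v = cc[v];
  }
  // v is now the root: make every visited vertex point directly to it.
  for (auto i : stack) {
    cc[i] = v;
  }
  return v;
}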

For union by rank, we simply use the negative values stored at the roots of the array: the smaller the value, the greater the rank. Here is the code:

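// Union by rank: a root r stores a negative value in cc[r],
// and a smaller (more negative) value means a greater rank.
bool edge_union2(int v0, int v1, int cc[]) {
  int r0 = find(v0, cc);
  int r1 = find(v1, cc);
  if (r0 == r1)
    return false;      // already connected: the edge would close a cycle
  if (cc[r0] < cc[r1])
    cc[r1] = r0;       // r0 has the greater rank: attach r1 under r0
  else if (cc[r1] < cc[r0])
    cc[r0] = r1;       // r1 has the greater rank: attach r0 under r1
  else {
    cc[r1] = r0;       // equal ranks: pick r0 as the root
    cc[r0] -= 1;       // and increase its rank by one
  }
  return true;
}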
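To tie this back to the question, here is a hedged sketch of how the same idea could be applied to edges with char endpoints like the asker's. The names Edge, findLeader, and kruskalSketch are invented for this example, std::sort stands in for the asker's sortList, and the optimizations are left out for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Mirrors the asker's edge class; the name Edge is an assumption for this sketch.
struct Edge {
    char leftV;
    char rightV;
    int weight;
};

// Follows parent links to the leader of v's set (no path compression).
char findLeader(std::map<char, char>& parent, char v) {
    parent.emplace(v, v);  // MAKE-SET on first sight: a new vertex is its own leader
    while (parent[v] != v) v = parent[v];
    return v;
}

// Returns the edges kept by Kruskal's algorithm; skipped edges would close a cycle.
std::vector<Edge> kruskalSketch(std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight < b.weight; });
    std::map<char, char> parent;
    std::vector<Edge> result;
    for (const Edge& e : edges) {
        char l = findLeader(parent, e.leftV);
        char r = findLeader(parent, e.rightV);
        if (l == r) continue;  // same component: this edge would form a cycle
        parent[r] = l;         // UNION: merge the two components
        result.push_back(e);
    }
    // Sanity check: a forest on V vertices has fewer than V edges.
    assert(result.empty() || result.size() < parent.size());
    return result;
}
```

For the edges a-b (weight 1), c-d (weight 2), a-c (weight 3), and b-d (weight 4), this keeps a-b, c-d, and a-c, and skips only b-d, which is the edge that would actually close a cycle.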

Answer 1: 5 votes, accepted, 2015-05-02 04:42:53

Answer 2: 2 votes, 2015-05-02 17:14:14
