Can Kruskal's algorithm be implemented this way, instead of using a disjoint-set forest?

Question

I am studying Kruskal's MST algorithm from this GeeksforGeeks article. The steps given are:

1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If cycle is not formed, include this edge. Else, discard it.
3. Repeat step (2) until there are (V-1) edges in the spanning tree.
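
For reference, the three steps can be sketched in Python without any disjoint set, by searching the edges chosen so far for a path between the two endpoints. This is only an illustrative sketch: the function names and the `(weight, u, v)` edge format are assumptions, not something taken from the article.

```python
from collections import defaultdict


def has_path(adj, src, dst):
    """Iterative depth-first search over the edges chosen so far."""
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def kruskal_with_path_check(vertices, edges):
    """edges is an iterable of (weight, u, v); returns the chosen edges."""
    adj = defaultdict(list)   # adjacency lists of the forest built so far
    chosen = []
    # Step 1: sort edges by non-decreasing weight.
    for weight, u, v in sorted(edges, key=lambda e: e[0]):
        # Step 2: an edge closes a cycle exactly when its endpoints
        # are already joined by a path of chosen edges.
        if not has_path(adj, u, v):
            adj[u].append(v)
            adj[v].append(u)
            chosen.append((weight, u, v))
        # Step 3: stop once V-1 edges have been chosen.
        if len(chosen) == len(vertices) - 1:
            break
    return chosen
```

The cycle check here costs time proportional to the size of the forest for every edge, which is the cost a disjoint set is meant to avoid.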

I really don't feel any need to use disjoint set. Instead for checking a cycle we can just store vertices in a visited array and mark them as true whenever an edge is selected. Looping through the program if we find an edge whose both vertices are in the visited array we ignore that edge.

In other words, instead of storing a disjoint-set forest, can't we just store an array of bits indicating which vertices have been linked to another edge in some previous step?

Answer 1

"I really don't feel any need to use disjoint set. Instead for checking a cycle we can just store vertices in a visited array and mark them as true whenever an edge is selected. Looping through the program if we find an edge whose both vertices are in the visited array we ignore that edge."

Yes, of course you can do that. The point of using a disjoint set in this algorithm is performance. A suitable disjoint set implementation yields better asymptotic performance than a List can.

Answer 2

The approach you're describing will not work properly in all cases. As an example, consider this line graph:

A - - B - - C - - D

Let's assume A - B has weight 1, C - D has weight 2, and B - C has weight 3. What will Kruskal's algorithm do here? First, it'll add in A - B, then C - D, and then B - C.

Now imagine what your implementation will do. When we add A - B, you'll mark A and B as having been visited. When we then add C - D, you'll mark C and D as having been visited. But then when we try to add B - C, since both B and C are visited, you'll decide not to add the edge, leaving a result that isn't connected.
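
To make the failure concrete, here is a small sketch of the visited-flag rule run on the line graph above. The function name and the `(weight, u, v)` edge format are illustrative assumptions; the rule itself is the one described in the question.

```python
def kruskal_visited_flags(vertices, edges):
    """The proposed variant: skip an edge when both endpoints are marked."""
    visited = {v: False for v in vertices}
    chosen = []
    for weight, u, v in sorted(edges, key=lambda e: e[0]):
        if visited[u] and visited[v]:
            continue  # proposed rule: treat this edge as closing a cycle
        visited[u] = visited[v] = True
        chosen.append((weight, u, v))
    return chosen


line_graph = [(1, "A", "B"), (2, "C", "D"), (3, "B", "C")]
result = kruskal_visited_flags(["A", "B", "C", "D"], line_graph)
print(result)  # [(1, 'A', 'B'), (2, 'C', 'D')]
```

Only two edges come back for four vertices, so B - C is wrongly rejected and the result is two separate pieces rather than a spanning tree.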

The issue here is that when building up an MST you may add edges linking nodes that have already been linked to other nodes in the past. The criterion for adding an edge is therefore less “have these nodes been linked before?” and more “is there already a path between these nodes?” That's where the disjoint-set forest comes in.
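
A minimal sketch of Kruskal's algorithm with a disjoint-set forest, where "is there already a path between these nodes?" becomes "do these nodes have the same root?". The names are illustrative, and path halving plus union by size are just one common choice of optimizations, assumed here rather than taken from the article.

```python
def find(parent, x):
    """Return the root of x's tree, halving the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def kruskal_disjoint_set(vertices, edges):
    """edges is an iterable of (weight, u, v); returns the chosen edges."""
    parent = {v: v for v in vertices}   # every vertex starts as its own tree
    size = {v: 1 for v in vertices}
    chosen = []
    for weight, u, v in sorted(edges, key=lambda e: e[0]):
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u == root_v:
            continue  # same tree already: this edge would close a cycle
        if size[root_u] < size[root_v]:
            root_u, root_v = root_v, root_u
        parent[root_v] = root_u         # attach the smaller tree to the larger
        size[root_u] += size[root_v]
        chosen.append((weight, u, v))
    return chosen
```

On the line graph above this accepts all three edges, because B and C have different roots when B - C is considered.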

It's great that you're poking and prodding conventional implementations and trying to find ways to improve them. You'll learn a lot about those algorithms if you do! In this case, it just so happens that what you're proposing doesn't quite work, and seeing why it doesn't work helps shed light on why the existing approach is what it is.

Answer 3

Your confusion probably derives from step 2, which is worded incorrectly.

It says "Check if it forms a cycle with the spanning tree formed so far.", but what has been formed so far is not a spanning tree. It's a forest of disconnected spanning trees, and some unvisited vertices.

Your algorithm fails when you find an edge that connects two different spanning trees in the forest. Those vertices will have been visited before, but the edge does not make a cycle.
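
One way to see this is to label every vertex with the tree it currently belongs to. The sketch below is illustrative only: the helper name is made up, the extra vertex E is a hypothetical addition to show an unvisited vertex, and the helper assumes the edges it is given contain no cycle.

```python
def tree_labels(vertices, chosen_edges):
    """Label each vertex with the tree it belongs to (None = unvisited)."""
    label = {v: None for v in vertices}
    next_label = 0
    for u, v in chosen_edges:
        if label[u] is None and label[v] is None:
            label[u] = label[v] = next_label   # start a new tree
            next_label += 1
        elif label[u] is None:
            label[u] = label[v]                # u joins v's tree
        elif label[v] is None:
            label[v] = label[u]                # v joins u's tree
        else:
            old, new = label[v], label[u]      # merge two existing trees
            for w in label:
                if label[w] == old:
                    label[w] = new
    return label


forest = tree_labels(["A", "B", "C", "D", "E"], [("A", "B"), ("C", "D")])
print(forest)  # {'A': 0, 'B': 0, 'C': 1, 'D': 1, 'E': None}
```

B and C have both been visited, yet they carry different labels, so an edge between them joins two trees instead of closing a cycle.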

Answer 1
1 2019-02-02 14:35:26

Answer 2
1 accepted 2019-02-02 17:21:25

Answer 3
0 2019-02-02 20:47:24
