
How does the disjoint-set data structure affect the performance of Kruskal's algorithm?

I have a basic idea of what Kruskal's algorithm is, and this is what I have found:

The algorithm constructs a minimum spanning tree by merging multiple trees, and it begins by sorting the edges by weight. Starting with an empty subgraph, it scans the sorted list of edges and adds the next edge to the subgraph if doing so does not create a cycle.

A disjoint set, on the other hand, is a data structure that can be implemented in a few ways, for example with linked lists or with a forest of trees, and it is used when deriving the minimum spanning tree.

What I want to know is: how does the disjoint set affect the performance of Kruskal's algorithm? Any help would be appreciated.

Kruskal's algorithm first sorts the edges. Since E ≤ V^2 in a simple graph, this can be done in O(E*log(E)) = O(E*log(V^2)) = O(E*2*log(V)) = O(E*log(V)) time.

The algorithm then iterates through the sorted edges and performs O(E) disjoint-set operations (two finds and possibly one union per iteration). The cost of these operations depends on the disjoint-set implementation. A naive implementation has an O(V) union and an O(1) find. This leads to O(E + V^2) time for this phase, because union is executed at most V - 1 times. With a disjoint-set forest using union by rank, by contrast, both operations take O(log(V)) time (this improves to O(α(V)) amortized with the addition of path compression).
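To make the naive case concrete, here is a minimal sketch of one possible naive disjoint set (often called "quick-find"). The class name `QuickFindDisjointSet` and the assumption that vertices are numbered 0..n-1 are illustrative choices, not something taken from the answer. Find is a single array lookup, while union relabels a whole set by scanning all vertices, which is where the O(V) cost comes from.

```python
class QuickFindDisjointSet:
    """Naive disjoint set: label[v] stores the set id of vertex v.

    Hypothetical illustration; assumes vertices are numbered 0..n-1.
    """

    def __init__(self, n):
        # Each vertex starts in its own singleton set.
        self.label = list(range(n))

    def find(self, v):
        # O(1): a single array lookup.
        return self.label[v]

    def union(self, a, b):
        # O(V): every vertex carrying b's label is relabeled with a's label.
        old, new = self.label[b], self.label[a]
        if old == new:
            return False  # already in the same set
        for v in range(len(self.label)):
            if self.label[v] == old:
                self.label[v] = new
        return True


ds = QuickFindDisjointSet(5)
ds.union(0, 1)
ds.union(3, 4)
print(ds.find(0) == ds.find(1))  # True
print(ds.find(1) == ds.find(3))  # False
```

Because at most V - 1 unions can succeed, the relabeling work is bounded by O(V^2) in total, and the O(E) finds add only O(E).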

Thus, Kruskal's algorithm with a naive disjoint-set implementation takes:
O(E*log(V)) + O(E + V^2) = O(E*log(V) + V^2) (in sufficiently sparse graphs the second term dominates)

With an implementation that has at least union by rank:
O(E*log(V)) + O(E*log(V)) = O(E*log(V))
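For comparison, a disjoint-set forest with union by rank and path compression might look like the sketch below. The class name `DisjointSetForest` and the 0..n-1 vertex numbering are assumptions made for illustration. Union by rank keeps the trees shallow (height O(log(V))), and path compression flattens each path that a find walks.

```python
class DisjointSetForest:
    """Disjoint-set forest with union by rank and path compression.

    Hypothetical illustration; assumes vertices are numbered 0..n-1.
    """

    def __init__(self, n):
        self.parent = list(range(n))  # every vertex is its own root
        self.rank = [0] * n           # upper bound on each tree's height

    def find(self, v):
        # First pass: walk up to the root.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[v] != root:
            next_v = self.parent[v]
            self.parent[v] = root
            v = next_v
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same set
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


forest = DisjointSetForest(4)
forest.union(0, 1)
forest.union(2, 3)
forest.union(1, 3)
print(forest.find(3) == forest.find(0))  # True
```

With either variant the disjoint-set phase costs no more than the O(E*log(V)) sort, so the sort determines the overall bound.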

The disjoint-set data structure is important for ensuring O(E * logE) overall complexity.

Let us consider the standard Kruskal algorithm.

KRUSKAL(G):

A = ∅
foreach v ∈ G.V:
    MAKE-SET(v)
foreach (u, v) in G.E ordered by weight(u, v), increasing:
    if FIND-SET(u) ≠ FIND-SET(v):
        A = A ∪ {(u, v)}
        UNION(u, v)
return A
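The pseudocode above could be turned into runnable Python roughly as follows. This is a sketch under stated assumptions: the function name `kruskal`, the `(weight, u, v)` edge format, and the 0..n-1 vertex numbering are illustrative, and the find step uses path halving, a common variant of path compression, rather than full compression.

```python
def kruskal(num_vertices, edges):
    """Return the edges of a minimum spanning forest as (u, v, weight) tuples.

    Assumes vertices are numbered 0..num_vertices-1 and edges are
    (weight, u, v) tuples; both are illustrative conventions.
    """
    parent = list(range(num_vertices))  # MAKE-SET for every vertex
    rank = [0] * num_vertices

    def find(v):
        # FIND-SET with path halving (a variant of path compression).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    # Sorting the edges is the O(E log E) step.
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two different trees: no cycle
            tree.append((u, v, weight))
            # UNION by rank: attach the shallower tree under the deeper one.
            if rank[root_u] < rank[root_v]:
                root_u, root_v = root_v, root_u
            parent[root_v] = root_u
            if rank[root_u] == rank[root_v]:
                rank[root_u] += 1
    return tree


edges = [(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 2, 3), (5, 1, 3)]
mst = kruskal(4, edges)
print(mst)                         # [(0, 1, 1), (1, 2, 2), (2, 3, 4)]
print(sum(w for _, _, w in mst))   # 7
```

In this small example the edges (0, 2) and (1, 3) are rejected because both endpoints already share a root when they are examined.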

The FIND-SET() and UNION() operations identify the sets (the trees of the forest) that contain the two endpoints and merge them. With a disjoint-set forest that uses union by rank and path compression, each operation takes O(α(V)) amortized time, where α is the inverse Ackermann function; this is not strictly O(1), but it is effectively constant for any practical input size. The overall cost of the FIND and UNION part is therefore O(E * α(V)), which is practically O(E).

Putting it together, we have O(V) + O(E * logE) + O(E * α(V)) = O(E * logE).

Hence, the disjoint-set data structure is important for ensuring O(E * logE) overall complexity.
