Implementing Kruskal's Algorithm in Python using Union Find

I'm trying to implement Kruskal's algorithm in Python using the union-find data structure. My implementation works on the small example I have developed here, but it has a small problem on the much larger homework graph. Can you help me see what is wrong with this implementation?

Here is my implementation:

class UnionFind:

    def __init__(self,val,leader):
        self.val = val
        self.leader = leader

    def changeLeader(self,leader):
        self.leader = leader

    def returnLeader(self):
        return self.leader

from collections import defaultdict
def kruskal(graph,edges, N):
    T = dict()
    sizes = defaultdict(lambda: 0)
    edgeWeights = []

    for indx, edge in enumerate(edges):
        n1 = graph[edge][0]
        n2 = graph[edge][1]

#         print("edge is", edge,"nodes",n1.val,n2.val,"leaders",n1.leader,n2.leader)
#         print("state of dict is", T.keys())
        if (n1.leader == n2.leader):
#             print("both nodes part of",n1.leader,'do nothing \n')
            pass

        elif (n1.leader in T.keys()) and (n2.leader not in T.keys()):
#             print("adding ",n2.val, "to group",n1.leader,'\n')
            n2.changeLeader(n1.leader)
            T[n1.leader].append(n2)
            sizes[n1.leader] += 1
            edgeWeights.append(edge)

        elif (n2.leader in T.keys()) and (n1.leader not in T.keys()):
#             print("adding ",n1.val, "to group",n2.leader,'\n')

            n1.changeLeader(n2.leader)
            T[n2.leader].append(n1)
            sizes[n2.leader] += 1
            edgeWeights.append(edge)

        elif (n1.leader in T.keys()) and (n2.leader in T.keys()) and (n1.leader != n2.leader):
#             print("merging groups",n1.leader,n2.leader)
            size1 = sizes[n1.leader]
            size2 = sizes[n2.leader]
            edgeWeights.append(edge)

#             print("sizes are",size1, size2)
            if size1 >= size2:
                for node in T[n2.leader]:
                    if node is not n2:
                        node.changeLeader(n1.leader)
                        T[n1.leader].append(node)
                        sizes[n1.leader] += 1
                        sizes[n2.leader] -= 1

                del T[n2.leader]
                sizes[n2.leader] = 0
                n2.changeLeader(n1.leader)
                T[n1.leader].append(n2)

#                 print("updated list of nodes",T.keys())
#                 for node in T[n1.leader]:
#                     print("includes",node.val)
            else:
                for node in T[n1.leader]:

                    if node is not n1:
                        node.changeLeader(n2.leader)
                        T[n2.leader].append(node)
                        sizes[n2.leader] += 1
                        sizes[n1.leader] -= 1

                del T[n1.leader]
                sizes[n1.leader] = 0
                n1.changeLeader(n2.leader)
                T[n2.leader].append(n1)
        else:
#             print("adding new group",n1.val,n2.val,'\n')
            n2.changeLeader(n1.leader)
            T[n1.leader] = [n1,n2]
            sizes[n1.leader] +=2
            edgeWeights.append(edge)

#         print("updated nodes",graph[edge][0].val,graph[edge][1].val,"leaders",
#               graph[edge][0].leader,graph[edge][1].leader,"\n")
    return T, edgeWeights

Here is the test code:

nodes = [UnionFind("A","A"),UnionFind("B","B"),UnionFind("C","C"),UnionFind("D","D"),UnionFind("E","E")]
graph = {1:[nodes[0],nodes[1]],2:[nodes[3],nodes[4]],
         3:[nodes[0],nodes[4]],4:[nodes[0],nodes[3]],
         5:[nodes[0],nodes[2]],6:[nodes[2], nodes[4]],
         7:[nodes[1],nodes[2]]}
N = 5
edges = list(graph.keys())
edges.sort()

T, weight = kruskal(graph,edges,N)

for node in T['A']:
    print(node.val)

print("edges",weight)

And the resulting output (with the debugging print statements uncommented):

edge is 1 nodes A B leaders A B
state of dict is dict_keys([])
adding new group A B 

updated nodes A B leaders A A 

edge is 2 nodes D E leaders D E
state of dict is dict_keys(['A'])
adding new group D E 

updated nodes D E leaders D D 

edge is 3 nodes A E leaders A D
state of dict is dict_keys(['A', 'D'])
merging groups A D
sizes are 2 2
updated list of nodes dict_keys(['A'])
includes A
includes B
includes D
includes E
updated nodes A E leaders A A 

edge is 4 nodes A D leaders A A
state of dict is dict_keys(['A'])
both nodes part of A do nothing 

updated nodes A D leaders A A 

edge is 5 nodes A C leaders A C
state of dict is dict_keys(['A'])
adding  C to group A 

updated nodes A C leaders A A 

edge is 6 nodes C E leaders A A
state of dict is dict_keys(['A'])
both nodes part of A do nothing 

updated nodes C E leaders A A 

edge is 7 nodes B C leaders A A
state of dict is dict_keys(['A'])
both nodes part of A do nothing 

updated nodes B C leaders A A 

A
B
D
E
C
edges [1, 2, 3, 5]

So the code should end with all nodes in the graph having a single parent. At least this is my understanding of Kruskal's algorithm. It does not do so on the larger graph, but I can't post that example here. Any ideas based on this code would be much appreciated.

"So the code should end with all nodes in the graph having a single parent."

No! The code should end with all nodes in the graph belonging to a single connected component, but this does not mean they all have the same parent in your union-find data structure. The data structure defines two nodes to be in the same connected component if they have the same root node, but they might not have the same parent.
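To make the distinction concrete, here is a minimal sketch, assuming each node stores a reference to its parent node. The `Node` class and `find_root` helper are hypothetical names used only for illustration; they are not part of the question's code. C's parent is B and B's parent is A, so B and C have different parents but share the root A.

```python
class Node:
    """Hypothetical union-find node whose parent is a node reference."""

    def __init__(self, val):
        self.val = val
        self.parent = self  # a root is its own parent


def find_root(node):
    # Follow parent links until reaching a node that is its own parent.
    while node.parent is not node:
        node = node.parent
    return node


a, b, c = Node("A"), Node("B"), Node("C")
b.parent = a  # B hangs directly under A
c.parent = b  # C hangs under B, two steps away from the root

same_parent = b.parent is c.parent          # parents are A and B
same_root = find_root(b) is find_root(c)    # both roots are A
print(same_parent, same_root)  # prints: False True
```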

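To correct your UnionFind class implementation, we need to make the returnLeader method search for the root node, instead of just returning the parent. Note that this assumes leader holds a reference to another UnionFind node, with a root node being its own leader, rather than a label such as "A" as in the test code: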

    def returnLeader(self):
        cur = self
        while cur != cur.leader:
            cur = cur.leader
        return cur
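For reference, a minimal sketch of how the class might be set up so that this loop works, assuming `leader` is a node reference. The default `leader = self` and the `union` helper are illustrative additions, not from the original post.

```python
class UnionFind:
    def __init__(self, val):
        self.val = val
        self.leader = self  # assumption: each node starts as its own root

    def returnLeader(self):
        # Walk up the leader links until a node that leads itself is found.
        cur = self
        while cur is not cur.leader:
            cur = cur.leader
        return cur


def union(n1, n2):
    """Hypothetical helper: merge two components, False if already merged."""
    r1, r2 = n1.returnLeader(), n2.returnLeader()
    if r1 is r2:
        return False
    r2.leader = r1  # attach one root under the other
    return True


a, b, c, d = (UnionFind(x) for x in "ABCD")
union(a, b)
union(c, d)
union(b, d)  # links the root of d (which is c) under the root of b (which is a)

# d's direct leader is still c, but its root is now a.
print(d.leader.val, d.returnLeader().val)  # prints: C A
```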

This is now logically correct, but we can improve the efficiency for large inputs by doing "path compression". To avoid repeating the same search many times, update the leader whenever the search finds a different root node. If we call returnLeader recursively, then it will update all the nodes along the path to the root node, too.

    def returnLeader(self):
        if self.leader != self.leader.leader:
            self.leader = self.leader.returnLeader()
        return self.leader
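Putting the pieces together, the following is a sketch of how Kruskal's algorithm could look with this root-based `returnLeader`, not a drop-in replacement for the question's function. It assumes edges are given as `(weight, u, v)` tuples and that `leader` is a node reference; the `size` attribute, the union-by-size step, and the `kruskal` signature are assumptions made for this example. The graph is the question's test graph, using each edge number as its weight.

```python
class UnionFind:
    def __init__(self, val):
        self.val = val
        self.leader = self  # assumption: a root points to itself
        self.size = 1       # component size, meaningful only at a root

    def returnLeader(self):
        # Path compression: repoint this node directly at the root.
        if self.leader is not self.leader.leader:
            self.leader = self.leader.returnLeader()
        return self.leader


def kruskal(nodes, weighted_edges):
    """Return (total_weight, chosen_edges) for a list of (weight, u, v) tuples."""
    chosen, total = [], 0
    for weight, u, v in sorted(weighted_edges, key=lambda e: e[0]):
        ru, rv = nodes[u].returnLeader(), nodes[v].returnLeader()
        if ru is rv:
            continue  # same root: this edge would close a cycle
        if ru.size < rv.size:
            ru, rv = rv, ru
        rv.leader = ru  # union by size: smaller tree goes under the larger
        ru.size += rv.size
        chosen.append((weight, u, v))
        total += weight
    return total, chosen


nodes = {name: UnionFind(name) for name in "ABCDE"}
edges = [(1, "A", "B"), (2, "D", "E"), (3, "A", "E"), (4, "A", "D"),
         (5, "A", "C"), (6, "C", "E"), (7, "B", "C")]
total, chosen = kruskal(nodes, edges)
print(total, [w for w, _, _ in chosen])  # prints: 11 [1, 2, 3, 5]
```

The selected edges match the `edges [1, 2, 3, 5]` output in the question, but here membership is decided by comparing roots, so no per-group node lists need to be maintained.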
