
Calculating the number of graphs created and the number of vertices in each graph from a list of edges

Given a list of edges such as edges = [[1,2],[2,3],[3,1],[4,5]]

I need to find how many graphs are created, that is, how many connected components these edges form, and then get the number of vertices in each component.

However, I am required to handle 10^5 edges, and I am currently having trouble completing the task for a large number of edges.

My algorithm currently takes the list edges = [[1,2],[2,3],[3,1],[4,5]], converts each edge to a set, and merges any sets that intersect. The output is a new list containing the grouped components, such as graphs = [[1,2,3],[4,5]].

There are two connected components: [1,2,3] are connected, and [4,5] are connected as well.

I would like to know if there is a better way of doing this task.

def mergeList(edges):
    sets = [set(x) for x in edges if x]
    m = 1
    while m:
        m = 0
        res = []
        while sets:
            common, r = sets[0], sets[1:]
            sets = []
            for x in r:
                if x.isdisjoint(common):
                    sets.append(x)
                else:
                    m = 1
                    common |= x
            res.append(common)
        sets = res
    return sets

I would like to try doing this with a dictionary or something more efficient, because this is too slow.

A basic iterative graph traversal in Python isn't too bad.

import collections


def connected_components(edges):
    # build the graph
    neighbors = collections.defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # traverse the graph
    sizes = []
    visited = set()
    for u in neighbors.keys():
        if u in visited:
            continue
        # visit the component that includes u
        size = 0
        agenda = {u}
        while agenda:
            v = agenda.pop()
            visited.add(v)
            size += 1
            agenda.update(neighbors[v] - visited)
        sizes.append(size)
    return sizes
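
Since the question asks for both the number of components and the size of each, note that the count is simply the length of the list of sizes. Here is a minimal self-contained sketch of the same traversal idea, using a collections.deque as the agenda and marking vertices as visited when they are queued. The function name component_sizes is my own illustrative choice, not part of the answer above.

```python
from collections import defaultdict, deque


def component_sizes(edges):
    """Return the size of each connected component (illustrative helper)."""
    # Build an adjacency map from the edge list.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    sizes = []
    visited = set()
    for start in neighbors:
        if start in visited:
            continue
        # Breadth-first search over the component containing `start`.
        visited.add(start)
        queue = deque([start])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in neighbors[v]:
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
        sizes.append(size)
    return sizes


sizes = component_sizes([[1, 2], [2, 3], [3, 1], [4, 5]])
print(len(sizes), sizes)  # 2 [3, 2]
```

Each vertex and each edge is handled a constant number of times, so the running time should grow roughly linearly with the number of edges.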

Do you need to write your own algorithm? networkx already has algorithms for this.

To get the size of each component, try:

import networkx as nx

G = nx.Graph()
G.add_edges_from([[1,2],[2,3],[3,1],[4,5]])

components = []
for graph in nx.connected_components(G):
    # each component is yielded as a set of vertices
    components.append([graph, len(graph)])

components
# [[{1, 2, 3}, 3], [{4, 5}, 2]]

You could use a disjoint-set data structure:

edges = [[1,2],[2,3],[3,1],[4,5]]
parents = {}
size = {}

def get_ancestor(parents, item):
    # Returns ancestor for a given item and compresses path
    # Recursion would be easier but might blow stack
    stack = []
    while True:
        parent = parents.setdefault(item, item)
        if parent == item:
            break
        stack.append(item)
        item = parent

    for item in stack:
        parents[item] = parent

    return parent


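for x, y in edges:
    x = get_ancestor(parents, x)
    y = get_ancestor(parents, y)
    if x == y:
        # Already in the same component; merging again would double-count the size
        continue
    size_x = size.setdefault(x, 1)
    size_y = size.setdefault(y, 1)
    # Union by size: attach the smaller tree under the larger one
    if size_x < size_y:
        parents[x] = y
        size[y] += size_x
    else:
        parents[y] = x
        size[x] += size_y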

print(sum(1 for k, v in parents.items() if k == v)) # 2

In the code above, parents is a dict whose keys are vertices and whose values are ancestors. A vertex that has no parent maps to itself. For every edge in the list, the two endpoints end up with the same ancestor, with the smaller tree attached under the larger one. Whenever an ancestor is looked up, the path is compressed, so subsequent lookups take close to O(1) amortized time. Together these give the whole algorithm nearly linear running time: O(n α(n)), where α is the inverse Ackermann function, which is effectively O(n) in practice.
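
Since the question also asks for the number of vertices in each component, here is a minimal sketch of the same disjoint-set idea wrapped in a function that returns a size per root. The names component_sizes_dsu and find are hypothetical, chosen for illustration; this is one possible arrangement under those assumptions, not the answer's exact code.

```python
def component_sizes_dsu(edges):
    """Return {root: size} via union by size with path compression (a sketch)."""
    parent = {}
    size = {}

    def find(item):
        # Register unseen vertices as their own singleton set.
        if item not in parent:
            parent[item] = item
            size[item] = 1
        # Walk up to the root.
        root = item
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every vertex on the path at the root.
        while parent[item] != root:
            nxt = parent[item]
            parent[item] = root
            item = nxt
        return root

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # same component already; do not count sizes twice
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        # Attach the smaller tree (rv) under the larger one (ru).
        parent[rv] = ru
        size[ru] += size[rv]
        del size[rv]  # keep sizes only for current roots
    return size


sizes = component_sizes_dsu([[1, 2], [2, 3], [3, 1], [4, 5]])
print(len(sizes), sizes)  # 2 {1: 3, 4: 2}
```

The number of components is the number of keys left in the returned dict, and each value is the vertex count of that component.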

Update

If the components themselves are required instead of just their count, the resulting dict can be iterated to produce them:

from collections import defaultdict

components = defaultdict(list)
for k in list(parents):
    # parents[k] may be an intermediate ancestor, so resolve the actual root
    components[get_ancestor(parents, k)].append(k)

print(components)

Output:

defaultdict(<class 'list'>, {1: [1, 2, 3], 4: [4, 5]})


 