
Given n tuples representing pairs, return a list with connected tuples

Given n tuples, write a function that will return a list with connected values.

Example:

pairs = [(1,62),
    (1,192),
    (1,168),
    (64,449),
    (263,449),      
    (192,289),
    (128,263),
    (128,345),
    (3,10),
    (10,11)
    ]

Result:

[[1,62,192,168,289],
 [64,449,263,128,345],
 [3,10,11]]

I believe it could be solved using graphs or trees as the data structure, creating a node for each value and an edge for each pair, with each tree or graph representing connected values, but I haven't found a solution yet.

What would be the best way in Python to produce a list of connected values for those pairs?

You can solve it with a Disjoint Set (Union-Find) implementation.

Initialize the structure djs with all of the numbers. Then for each tuple (x,y), call djs.merge(x,y). Now for each number x, create a new set for it iff djs.sameSet(x,y)==false for an arbitrary y from each existing set.

Maybe that could help you.
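The steps above can be sketched in Python roughly as follows. This is only an illustration, not the answerer's code: the class name DisjointSet, the helper connected_values, and the use of path compression with union by size are my own assumptions. Instead of calling sameSet against every existing set, the sketch groups each number under the root returned by find, which yields the same partition.

```python
class DisjointSet:
    """Minimal union-find (hypothetical helper, not from the original answer)."""

    def __init__(self):
        self.parent = {}  # element -> parent element
        self.size = {}    # root -> number of elements in its set

    def find(self, x):
        # Register unseen elements as singleton sets.
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        # Walk up to the root of x's set.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def merge(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return
        # Union by size: attach the smaller tree under the larger one.
        if self.size[root_x] < self.size[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        self.size[root_x] += self.size[root_y]


def connected_values(pairs):
    djs = DisjointSet()
    for x, y in pairs:
        djs.merge(x, y)
    # Group every element under the root of its set.
    groups = {}
    for x in list(djs.parent):
        groups.setdefault(djs.find(x), []).append(x)
    return list(groups.values())


pairs = [(1, 62), (1, 192), (1, 168), (64, 449), (263, 449),
         (192, 289), (128, 263), (128, 345), (3, 10), (10, 11)]
print(connected_values(pairs))
```

The order of the groups, and of the values inside each group, depends on insertion order, so sort them before comparing results.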

I didn't know this problem already had a name (thanks avim!), so I went ahead and solved it naively.

This solution is somewhat similar to Eli Rose's. I decided to post it anyway, because it is a bit more efficient for large lists of pairs: the lists_by_element dictionary keeps track of the list each element is in, so we avoid iterating through all the lists and their items every time we need to add a new item.

Here's the code:

def connected_tuples(pairs):
    # for every element, we keep a reference to the list it belongs to
    lists_by_element = {}

    def make_new_list_for(x, y):
        lists_by_element[x] = lists_by_element[y] = [x, y]

    def add_element_to_list(lst, el):
        lst.append(el)
        lists_by_element[el] = lst

    def merge_lists(lst1, lst2):
        merged_list = lst1 + lst2
        for el in merged_list:
            lists_by_element[el] = merged_list

    for x, y in pairs:
        xList = lists_by_element.get(x)
        yList = lists_by_element.get(y)

        if not xList and not yList:
            # neither element has been seen yet
            make_new_list_for(x, y)
        elif xList and not yList:
            add_element_to_list(xList, y)
        elif yList and not xList:
            add_element_to_list(yList, x)
        elif xList is not yList:
            # both elements are known but live in different lists
            merge_lists(xList, yList)

    # return the unique lists present in the dictionary
    return set(tuple(l) for l in lists_by_element.values())

And here's how it works: http://ideone.com/tz9t7m

Another solution that is more compact than wOlf's but, unlike Eli's, handles merges:

def connected_components(pairs):
    components = []
    for a, b in pairs:
        for component in components:
            if a in component:
                for i, other_component in enumerate(components):
                    if b in other_component and other_component != component: # a, and b are already in different components: merge
                        component.extend(other_component)
                        components[i:i+1] = []
                        break # we don't have to look for other components for b
                else: # b wasn't found in any other component
                    if b not in component:
                        component.append(b)
                break # we don't have to look for other components for a
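            if b in component: # a wasn't in this component
                for i, other_component in enumerate(components):
                    if a in other_component: # a is in a different component: merge
                        component.extend(other_component)
                        components[i:i+1] = []
                        break # we don't have to look for other components for a
                else: # a wasn't found in any other component
                    component.append(a)
                break # we don't have to look further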
        else: # neither a nor b were found
            components.append([a, b])
    return components

Notice that I rely on breaking out of loops when I find an element in a component, so that I can use the else clause of the loop to handle the case where the elements are not yet in any component (the else is executed if the loop ended without break).

You could also use networkx as a dependency.
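For readers unfamiliar with the loop else clause, here is a tiny standalone sketch of the same pattern. The function name first_group_containing is hypothetical and is not part of the answer's code.

```python
def first_group_containing(groups, value):
    """Return the first group holding value, or None if no group does."""
    for group in groups:
        if value in group:
            found = group
            break  # leaving via break skips the else clause below
    else:
        # Reached only when the loop ran to completion without break.
        found = None
    return found


print(first_group_containing([[1, 2], [3]], 3))
print(first_group_containing([[1, 2], [3]], 9))
```

The first call finds the group [3]; the second exhausts the loop, so the else branch runs and None is returned.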

import networkx as nx

pairs = [(1,62),
        (1,192),
        (1,168),
        (64,449),
        (263,449),      
        (192,289),
        (128,263),
        (128,345),
        (3,10),
        (10,11)]


G = nx.Graph()
G.add_edges_from(pairs)
list(nx.connected_components(G))

It seems like you have a graph (in the form of a list of edges) that may not be all in one piece ("connected"), and you want to divide it up into pieces ("components").

Once we think about it in the language of graphs, we're mostly done. We can keep a list of all the components we've found so far (these will be sets of nodes) and add a node to a set if its partner is already there. Otherwise, we make a new component for this pair.

def graph_components(edges):
    """
    Given a graph as a list of edges, divide the nodes into components.

    Takes a list of pairs of nodes, where the nodes are integers.
    Returns a list of sets of nodes (the components).
    """

    # A list of sets.
    components = []

    for v1, v2 in edges:
        # See if either end of the edge has been seen yet.
        for component in components:
            if v1 in component or v2 in component:
                # Add both vertices -- duplicates will vanish.
                component.add(v1)
                component.add(v2)
                break
        else:
            # If neither vertex is already in a component.
            components.append({v1, v2})

    return components

I've used the weird for ... else construction for the sake of keeping this to one function: the else gets executed if a break statement was not reached during the for. The inner loop could just as well be a function returning True or False.
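One case worth checking by hand is an edge that bridges two components already in the list. The sketch below restates the function above in condensed form (the name greedy_components and the three-edge input are my own, chosen for illustration) and runs it on such an input.

```python
def greedy_components(edges):
    # Condensed restatement of the function above, for illustration only.
    components = []
    for v1, v2 in edges:
        for component in components:
            if v1 in component or v2 in component:
                component.add(v1)
                component.add(v2)
                break
        else:
            components.append({v1, v2})
    return components


# The last edge joins the two components created by the first two edges.
edges = [(1, 2), (3, 4), (2, 3)]
result = greedy_components(edges)
print(result)  # two overlapping sets: {1, 2, 3} and {3, 4}
```

Node 3 ends up in both sets, because the function only ever extends the first matching component and never merges two existing ones. A correct result for this input would be the single component {1, 2, 3, 4}.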


EDIT: As Francis Colas points out, this approach is too greedy. Here's a completely different approach (many thanks to Edward Mann for this beautiful DFS implementation).

This approach is based on constructing a graph, then doing traversals on it until we run out of unvisited nodes. It should run in linear time (O(n) to construct the graph, O(n) to do all the traversals, and I believe O(n) just to do the set difference).

from collections import defaultdict

def dfs(start, graph):
    """
    Does depth-first search, returning a set of all nodes seen.
    Takes: a graph in node --> [neighbors] form.
    """
    visited, worklist = set(), [start]

    while worklist:
        node = worklist.pop()
        if node not in visited:
            visited.add(node)
            # Add all the neighbors to the worklist.
            worklist.extend(graph[node])

    return visited

def graph_components(edges):
    """
    Given a graph as a list of edges, divide the nodes into components.
    Takes a list of pairs of nodes, where the nodes are integers.
    """

    # Construct a graph (mapping node --> [neighbors]) from the edges.
    graph = defaultdict(list)
    nodes = set()

    for v1, v2 in edges:
        nodes.add(v1)
        nodes.add(v2)

        graph[v1].append(v2)
        graph[v2].append(v1)

    # Traverse the graph to find the components.
    components = []

    # We don't care what order we see the nodes in.
    while nodes:
        component = dfs(nodes.pop(), graph)
        components.append(component)

        # Remove this component from the nodes under consideration.
        nodes -= component

    return components

I came up with 2 different solutions:

The first one, which I prefer, links each record with a parent. We then navigate up the hierarchy until we reach an element that is mapped to itself.

So the code would be:

def find_root(mapping, elt):
    # Navigate up the hierarchy until an element is mapped to itself.
    while mapping[elt] != elt:
        elt = mapping[elt]
    return elt


def build_mapping(input_pairs):
    mapping = {}

    for left, right in input_pairs:
        # An element seen for the first time starts as its own parent.
        mapping.setdefault(left, left)
        mapping.setdefault(right, right)

        root_left = find_root(mapping, left)
        root_right = find_root(mapping, right)

        # Link the two hierarchies unless they already share a root.
        if root_left != root_right:
            mapping[root_right] = root_left

    return mapping


def get_groups(input_pairs):
    mapping = build_mapping(input_pairs)

    # Group every element under the root of its hierarchy, not its direct
    # parent, so elements several levels deep land in the right group.
    groups = {}
    for elt in mapping:
        root = find_root(mapping, elt)
        groups.setdefault(root, set()).add(elt)

    return list(groups.values())

So, with the following input:

groups = get_groups([('A', 'B'), ('A', 'C'), ('D', 'A'), ('E', 'F'), 
                     ('F', 'C'), ('G', 'H'), ('I', 'J'), ('K', 'L'), 
                     ('L', 'M'), ('M', 'N')])

We get:

[{'A', 'B', 'C', 'D', 'E', 'F'}, {'G', 'H'}, {'I', 'J'}, {'K', 'L', 'M', 'N'}]

The second, possibly less efficient, solution would be:

def get_groups_second_method(input_pairs):
    groups = []

    for pair in input_pairs:
        left = pair[0]
        right = pair[1]

        left_group = None
        right_group = None
        for i in range(0, len(groups)):
            group = groups[i]

            if left in group:
                left_group = (group, i)

            if right in group:
                right_group = (group, i)

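        if left_group is not None and right_group is not None:
            # Merge only when the two elements live in different groups;
            # otherwise the pop below would delete the group they share.
            if left_group[1] != right_group[1]:
                merged = right_group[0].union(left_group[0])
                groups[right_group[1]] = merged
                groups.pop(left_group[1])
            continue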

        if left_group is None and right_group is None:
            new_group = {left, right}
            groups.append(new_group)
            continue

        if left_group is None:
            right_group[0].add(left)
        else:
            left_group[0].add(right)

    return groups
