Given n tuples representing pairs, return a list with connected tuples
Given n tuples, write a function that will return a list with connected values.
Example:
pairs = [(1,62),
(1,192),
(1,168),
(64,449),
(263,449),
(192,289),
(128,263),
(128,345),
(3,10),
(10,11)
]
Result:
[[1,62,192,168,289],
[64,449,263,128,345],
[3,10,11]]
I believe it could be solved using graphs or trees as the data structure, creating a node for each value and an edge for each pair, with each tree or graph representing connected values, but I haven't found a solution yet.
What would be the best way in Python to produce a list of connected values for those pairs?
You can solve it with a Disjoint Set (Union-Find) implementation.
Initialize the structure djs with all of the numbers. Then, for each tuple (x, y), call djs.merge(x, y). Now, for each number x, create a new set for it iff djs.sameSet(x, y) == false for an arbitrary y taken from each existing set; otherwise add x to the set whose y it matches.
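The steps above can be sketched in Python roughly as follows. This is a minimal illustration rather than the answerer's own code: the function name `connected_values` and the dict-based `parent` table are assumptions, `merge` plays the role of `djs.merge`, and comparing `find(x)` with `find(y)` stands in for `djs.sameSet(x, y)`.

```python
def connected_values(pairs):
    """Group values linked by pairs, using a dict-based union-find."""
    parent = {}

    def find(x):
        # Register unseen values as their own root.
        parent.setdefault(x, x)
        # Walk up to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def merge(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_x] = root_y

    for x, y in pairs:
        merge(x, y)

    # Collect members under their root, in first-seen order.
    groups = {}
    for value in parent:
        groups.setdefault(find(value), []).append(value)
    return list(groups.values())


pairs = [(1, 62), (1, 192), (1, 168), (64, 449), (263, 449),
         (192, 289), (128, 263), (128, 345), (3, 10), (10, 11)]
for group in connected_values(pairs):
    print(sorted(group))
```

Grouping by root at the end replaces the "compare x against one y from each existing set" step: two values share a set exactly when `find` returns the same root for both.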
I didn't know this problem already had a name (thanks avim!), so I went ahead and solved it naively.
This solution is somewhat similar to Eli Rose's. I decided to post it though, because it is a bit more efficient for large lists of pairs: the lists_by_element dictionary keeps track of the list an element is in, so we avoid iterating through all the lists and their items every time we need to add a new item.
Here's the code:
def connected_tuples(pairs):
    # for every element, we keep a reference to the list it belongs to
    lists_by_element = {}

    def make_new_list_for(x, y):
        lists_by_element[x] = lists_by_element[y] = [x, y]

    def add_element_to_list(lst, el):
        lst.append(el)
        lists_by_element[el] = lst

    def merge_lists(lst1, lst2):
        merged_list = lst1 + lst2
        for el in merged_list:
            lists_by_element[el] = merged_list

    for x, y in pairs:
        xList = lists_by_element.get(x)
        yList = lists_by_element.get(y)
        if not xList and not yList:
            make_new_list_for(x, y)
        if xList and not yList:
            add_element_to_list(xList, y)
        if yList and not xList:
            add_element_to_list(yList, x)
        if xList and yList and xList != yList:
            merge_lists(xList, yList)

    # return the unique lists present in the dictionary
    return set(tuple(l) for l in lists_by_element.values())
And here's how it works: http://ideone.com/tz9t7m
Another solution that is more compact than wOlf's but, unlike Eli's, handles merging:
def connected_components(pairs):
    components = []
    for a, b in pairs:
        for component in components:
            if a in component or b in component:
                if a not in component:
                    a, b = b, a  # make a the element that is in this component
                for i, other_component in enumerate(components):
                    if b in other_component and other_component is not component:  # a and b are in different components: merge
                        component.extend(other_component)
                        del components[i]
                        break  # we don't have to look for other components for b
                else:  # b wasn't found in any other component
                    if b not in component:
                        component.append(b)
                break  # we don't have to look further
        else:  # neither a nor b were found
            components.append([a, b])
    return components
Notice that I rely on breaking out of loops when I find an element in a component, so that I can use the else clause of the loop to handle the case where the elements are not yet in any component (the else is executed if the loop ended without break).
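For readers who have not met the loop else clause before, here is a stripped-down illustration of the same idiom. The helper name `first_component_containing` is hypothetical and only exists for this example.

```python
def first_component_containing(value, components):
    """Return the component holding value, or None if there is none."""
    for component in components:
        if value in component:
            found = component
            break  # leaving via break skips the else clause
    else:
        # Runs only when the loop finished without hitting break.
        found = None
    return found


components = [[1, 62], [3, 10]]
print(first_component_containing(10, components))  # [3, 10]
print(first_component_containing(99, components))  # None
```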
You could also use networkx as a dependency.
import networkx as nx
pairs = [(1,62),
(1,192),
(1,168),
(64,449),
(263,449),
(192,289),
(128,263),
(128,345),
(3,10),
(10,11)]
G = nx.Graph()
G.add_edges_from(pairs)
list(nx.connected_components(G))
It seems like you have a graph (in the form of a list of edges) that may not be all in one piece ("connected") and you want to divide it up into pieces ("components").
Once we think about it in the language of graphs, we're mostly done. We can keep a list of all the components we've found so far (these will be sets of nodes) and add a node to a set if its partner is already there. Otherwise, make a new component for this pair.
def graph_components(edges):
    """
    Given a graph as a list of edges, divide the nodes into components.
    Takes a list of pairs of nodes, where the nodes are integers.
    Returns a list of sets of nodes (the components).
    """
    # A list of sets.
    components = []
    for v1, v2 in edges:
        # See if either end of the edge has been seen yet.
        for component in components:
            if v1 in component or v2 in component:
                # Add both vertices -- duplicates will vanish.
                component.add(v1)
                component.add(v2)
                break
        else:
            # If neither vertex is already in a component.
            components.append({v1, v2})
    return components
I've used the weird for ... else construction for the sake of making this one function: the else gets executed if a break statement was not reached during the for. The inner loop could just as well be a function returning True or False.
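One case worth checking with the function above is an edge that joins two components which already exist. The snippet below is a small reproduction; `greedy_components` is a hypothetical stand-alone copy of the same strategy, renamed so the example runs on its own.

```python
def greedy_components(edges):
    """Same greedy strategy: join the first component that shares a vertex."""
    components = []
    for v1, v2 in edges:
        for component in components:
            if v1 in component or v2 in component:
                component.update((v1, v2))
                break
        else:
            components.append({v1, v2})
    return components


# (1, 2) and (3, 4) start two components; (2, 3) should fuse them.
result = greedy_components([(1, 2), (3, 4), (2, 3)])
print(result)
print(len(result))  # 2, although all four nodes are connected
```

The edge (2, 3) is absorbed by the first component that contains 2, so 3 ends up in both sets and the two components are never merged.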
EDIT: As Francis Colas points out, this approach is too greedy. Here's a completely different approach (many thanks to Edward Mann for this beautiful DFS implementation).
This approach is based upon constructing a graph, then doing traversals on it until we run out of unvisited nodes. It should run in linear time (O(n) to construct the graph, O(n) to do all the traversals, and I believe O(n) just to do the set difference).
from collections import defaultdict

def dfs(start, graph):
    """
    Does depth-first search, returning a set of all nodes seen.
    Takes: a graph in node --> [neighbors] form.
    """
    visited, worklist = set(), [start]
    while worklist:
        node = worklist.pop()
        if node not in visited:
            visited.add(node)
            # Add all the neighbors to the worklist.
            worklist.extend(graph[node])
    return visited

def graph_components(edges):
    """
    Given a graph as a list of edges, divide the nodes into components.
    Takes a list of pairs of nodes, where the nodes are integers.
    """
    # Construct a graph (mapping node --> [neighbors]) from the edges.
    graph = defaultdict(list)
    nodes = set()
    for v1, v2 in edges:
        nodes.add(v1)
        nodes.add(v2)
        graph[v1].append(v2)
        graph[v2].append(v1)
    # Traverse the graph to find the components.
    components = []
    # We don't care what order we see the nodes in.
    while nodes:
        component = dfs(nodes.pop(), graph)
        components.append(component)
        # Remove this component from the nodes under consideration.
        nodes -= component
    return components
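The function above returns sets, in no particular order. If the list-of-lists shape from the question is needed, a small normalisation step can be applied afterwards. The helper name `as_sorted_lists` is an assumption made for this sketch, and it presumes the nodes can be compared with one another (true for the integers in the question).

```python
def as_sorted_lists(components):
    """Turn an iterable of sets into a sorted list of sorted lists."""
    return sorted(sorted(component) for component in components)


# Components as a set-based solution might return them for the question's pairs.
components = [{64, 449, 263, 128, 345}, {3, 10, 11}, {1, 62, 192, 168, 289}]
print(as_sorted_lists(components))
# [[1, 62, 168, 192, 289], [3, 10, 11], [64, 128, 263, 345, 449]]
```

Sorting both levels also makes results from different solutions directly comparable with ==.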
I came up with two different solutions:
The first one, which I prefer, links each record with a parent, and then navigates further up the hierarchy until an element is mapped to itself.
So the code would be:
def find_root(mapping, elt):
    # Follow parent links until we reach an element mapped to itself.
    while mapping[elt] != elt:
        elt = mapping[elt]
    return elt

def build_mapping(input_pairs):
    mapping = {}
    for left, right in input_pairs:
        if left not in mapping and right not in mapping:
            # Neither side seen yet: left becomes the root of a new group.
            mapping[left] = left
            mapping[right] = left
        elif left not in mapping:
            mapping[left] = find_root(mapping, right)
        elif right not in mapping:
            mapping[right] = find_root(mapping, left)
        else:
            # Both seen: link the two roots unless they already coincide.
            root_left = find_root(mapping, left)
            root_right = find_root(mapping, right)
            if root_left != root_right:
                mapping[root_left] = root_right
    return mapping

def get_groups(input_pairs):
    mapping = build_mapping(input_pairs)
    groups = {}
    for elt in mapping:
        # Group by root rather than by direct parent, so that elements
        # sitting below a merged root still land in the right group.
        groups.setdefault(find_root(mapping, elt), set()).add(elt)
    return list(groups.values())
So, with the following input:
groups = get_groups([('A', 'B'), ('A', 'C'), ('D', 'A'), ('E', 'F'),
('F', 'C'), ('G', 'H'), ('I', 'J'), ('K', 'L'),
('L', 'M'), ('M', 'N')])
We get:
[{'A', 'B', 'C', 'D', 'E', 'F'}, {'G', 'H'}, {'I', 'J'}, {'K', 'L', 'M', 'N'}]
The second, maybe less efficient, solution would be:
def get_groups_second_method(input_pairs):
    groups = []
    for left, right in input_pairs:
        left_group = None
        right_group = None
        # Remember which group (and index) each side already belongs to.
        for i, group in enumerate(groups):
            if left in group:
                left_group = (group, i)
            if right in group:
                right_group = (group, i)
        if left_group is not None and right_group is not None:
            # Merge only when the two sides sit in different groups;
            # popping when both indices match would discard the group.
            if left_group[1] != right_group[1]:
                merged = right_group[0].union(left_group[0])
                groups[right_group[1]] = merged
                groups.pop(left_group[1])
            continue
        if left_group is None and right_group is None:
            new_group = {left, right}
            groups.append(new_group)
            continue
        if left_group is None:
            right_group[0].add(left)
        else:
            left_group[0].add(right)
    return groups