Finding common elements between tuples in a list in Python
Suppose I have a list of tuples ts in Python:
ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
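How can I get a list of the elements that are common to two or more of these tuples?
Assume that the tuples in ts, and the elements within each tuple, are already sorted numerically.
For this example, the expected output should be: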
ts_output = [703, 803, 903]
Here is what I have so far:
ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
ts = set(ts)
t1 = set(w for w,x in ts for y,z in ts if w == y) # t1 should only contain 803
print("t1: ", t1)
t2 = set(y for w,x in ts for y,z in ts if x == y) # t2 should only contain 703
print("t2: ", t2)
t3 = set(x for w,x in ts for y,z in ts if x == z) # t3 should only contain 903
print("t3: ", t3)
This is the corresponding output:
t1: {803, 901, 902, 702, 703}
t2: {703}
t3: {704, 805, 806, 903, 703}
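From the above, only t2 gives the expected output, and I am not sure what is happening with t1 and t3.
You can use this alternative input to test your code; it should give exactly the same output: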
ts = [(701,703), (702,703), (703,704), (803,805), (803,806), (901,903), (902,903), (903,904)]
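A note on why t1 and t3 contain too much: the nested loops pair every tuple with every tuple, including itself. For the self-pair, w == y and x == z are always true, so every first element lands in t1 and every second element lands in t3. The sketch below is one way to avoid that, comparing only distinct pairs with itertools.combinations. The names common and result are illustrative, not from the original post.

```python
from itertools import combinations

ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]

# combinations(ts, 2) yields each unordered pair of *different* tuples once,
# so a tuple is never compared with itself.
common = set()
for a, b in combinations(ts, 2):
    common |= set(a) & set(b)  # elements shared by this pair

result = sorted(common)
print(result)  # [703, 803, 903]
```

This compares every pair, so the work grows quadratically with the number of tuples. The counting approaches in the answers need only one pass over the data.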
import collections
ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
flat_list = [item for sublist in ts for item in sublist]
duplicates = [item for item, count in collections.Counter(flat_list).items() if count > 1]
print(duplicates)
With your input, you first need to flatten the list.
#1 Simple and pythonic
flat_list = [item
             for sublist in ts
             for item in sublist]
#2 More efficient.
import itertools
flat_list = itertools.chain.from_iterable(ts)
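With approach #1, flat_list is a list object. With approach #2, it is a lazy itertools.chain iterator, which can be consumed only once. Both yield the same elements when iterated.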
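One practical difference between the two approaches is that the object returned by itertools.chain.from_iterable is a one-shot iterator. The short sketch below illustrates this. The variable names as_list, as_iter, first_pass and second_pass are illustrative only.

```python
from itertools import chain

ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]

as_list = [item for sublist in ts for item in sublist]  # approach #1
as_iter = chain.from_iterable(ts)                       # approach #2

first_pass = list(as_iter)   # consumes the iterator
second_pass = list(as_iter)  # nothing is left to yield

print(first_pass == as_list)  # True
print(second_pass)            # []
```

If the flattened data is needed more than once, keeping it as a list (or calling chain.from_iterable again) avoids silently getting an empty result.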
Now you can count the elements in flat_list. Any element whose count is greater than 1 is a duplicate.
for item, count in collections.Counter(flat_list).items():
    if count > 1:
        print(item)
Or you can use a more Pythonic list comprehension:
duplicates = [item
              for item, count in collections.Counter(flat_list).items()
              if count > 1]
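The counting step can be wrapped in a small function, sketched here under two assumptions that go beyond the original answer: the function name common_elements and its min_count parameter are hypothetical, and each value is counted at most once per tuple (via set(t)) so that a tuple repeating its own value, such as (1, 1), is not mistaken for a value shared between tuples.

```python
from collections import Counter


def common_elements(ts, min_count=2):
    """Return the sorted values that appear in at least min_count tuples."""
    # set(t) counts a value once per tuple, even if the tuple repeats it.
    counts = Counter(value for t in ts for value in set(t))
    return sorted(value for value, n in counts.items() if n >= min_count)


ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]
ts_alt = [(701, 703), (702, 703), (703, 704), (803, 805), (803, 806),
          (901, 903), (902, 903), (903, 904)]

print(common_elements(ts))                        # [703, 803, 903]
print(common_elements(ts_alt))                    # [703, 803, 903]
print(common_elements([(1, 1), (2, 3), (3, 4)]))  # [3]
```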
You need to flatten the list of tuples. You can use itertools.chain:
>>> from itertools import chain
>>> flat_list = list(chain(*ts))
>>> flat_list
[702, 703, 703, 704, 803, 805, 803, 806, 901, 903, 902, 903]
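Alternatively, you can use itertools.chain.from_iterable, which does the same thing as itertools.chain but does not require unpacking the iterable:
>>> flat_list = list(chain.from_iterable(ts))
>>> flat_list
[702, 703, 703, 704, 803, 805, 803, 806, 901, 903, 902, 903]
Once this is done, you can use collections.Counter to count the occurrences of each element in the flattened list, then keep only the elements that occur more than once.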
>>> from collections import Counter
>>> c = Counter(flat_list)
>>> c
Counter({803: 2, 903: 2, 703: 2, 704: 1, 805: 1, 806: 1, 901: 1, 902: 1, 702: 1})
Then finally filter c:
>>> [k for k,v in c.items() if v>1]
[803, 903, 703]
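The order of the filtered result depends on the Python version: the [803, 903, 703] shown above reflects older dictionary ordering, while on Python 3.7 and later a Counter iterates in first-seen order. A minimal sketch, assuming a deterministic ascending order is wanted (as in the question's expected output), is to sort explicitly. The names first_seen_order and sorted_order are illustrative.

```python
from collections import Counter
from itertools import chain

ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]
c = Counter(chain.from_iterable(ts))

# Iteration order follows first insertion on Python 3.7+.
first_seen_order = [k for k, v in c.items() if v > 1]

# Sorting makes the result independent of iteration order.
sorted_order = sorted(first_seen_order)
print(sorted_order)  # [703, 803, 903]
```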
>>> from collections import Counter
>>> ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
>>> c = Counter(el for t in ts for el in t)
>>> [k for k in c if c[k] >= 2]
[703, 803, 903]
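Here is a solution that does it in a single pass instead of two, building the result as it goes (I am not sure whether it is faster or slower in practice for a very large ts):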
>>> from collections import Counter
>>> from itertools import chain
>>> ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
>>> def find_common(ts):
...     c = Counter()
...     for num in chain.from_iterable(ts):
...         c[num] += 1
...         if c[num] == 2:
...             yield num
...
>>> list(find_common(ts))
[703, 803, 903]
Without Counter:
>>> def find_common(ts):
...     seen, dupes = set(), set()
...     for num in chain.from_iterable(ts):
...         if num in seen and num not in dupes:
...             dupes.add(num)
...             yield num
...         seen.add(num)
...
>>> list(find_common(ts))
[703, 803, 903]
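On the open question of whether the single-pass version is faster or slower for a large ts, one way to find out is to measure it. The following is only a rough sketch using timeit on synthetic random data. The function names two_pass and one_pass, the data size, and the value range are all assumptions chosen for illustration, the synthetic tuples are not sorted, and timings will vary by machine and Python version.

```python
import random
import timeit
from collections import Counter
from itertools import chain


def two_pass(ts):
    # Count everything first, then filter the counts.
    c = Counter(chain.from_iterable(ts))
    return [k for k, v in c.items() if v > 1]


def one_pass(ts):
    # Yield each value the second time it is seen.
    seen, dupes = set(), set()
    for num in chain.from_iterable(ts):
        if num in seen and num not in dupes:
            dupes.add(num)
            yield num
        seen.add(num)


random.seed(0)  # reproducible synthetic input
big_ts = [(random.randrange(50_000), random.randrange(50_000))
          for _ in range(100_000)]

same = sorted(two_pass(big_ts)) == sorted(one_pass(big_ts))
t_two = timeit.timeit(lambda: two_pass(big_ts), number=5)
t_one = timeit.timeit(lambda: list(one_pass(big_ts)), number=5)

print(f"same result: {same}")
print(f"two-pass Counter: {t_two:.3f}s, one-pass sets: {t_one:.3f}s")
```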