
Finding common elements between tuples in a list in python

If I have a list ts of tuples in python:

ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]

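How do I obtain a list containing the elements that are common to two or more of these tuples?

Assume that both the tuples in ts and the elements within each tuple are already numerically sorted.

For this example, the expected output should be: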

ts_output = [703, 803, 903]

Below is what I have so far:

ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
ts = set(ts)

t1 = set(w for w,x in ts for y,z in ts if w == y) # t1 should only contain 803
print("t1: ", t1)

t2 = set(y for w,x in ts for y,z in ts if x == y) # t2 should only contain 703
print("t2: ", t2)

t3 = set(x for w,x in ts for y,z in ts if x == z) # t3 should only contain 903
print("t3: ", t3)

Here is the corresponding output:

t1: {803, 901, 902, 702, 703}
t2: {703}
t3: {704, 805, 806, 903, 703}

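From the above, only t2 gives the expected output, and I am not sure what happened with t1 and t3.

You can use this alternate input to test your code; it should give exactly the same output: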

ts = [(701,703), (702,703), (703,704), (803,805), (803,806), (901,903), (902,903), (903,904)]

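On why t1 and t3 pick up extra values: the nested loops `for w,x in ts for y,z in ts` also pair every tuple with itself, so `w == y` and `x == z` hold trivially for each tuple, and every first (or second) element ends up in the set. t2 is unaffected only because `x == y` cannot hold within a single tuple whose two values differ. A minimal sketch of one way to avoid the self-comparison, using itertools.combinations so that only two different positions in the list are ever compared (the name `common` is introduced here for illustration and is not from the question):

```python
from itertools import combinations

ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]

# combinations(ts, 2) yields each unordered pair of distinct list positions,
# so a tuple is never compared against itself.
common = set()
for a, b in combinations(ts, 2):
    common |= set(a) & set(b)   # values shared by this pair of tuples

print(sorted(common))  # [703, 803, 903]
```

This compares every pair of tuples, so the work grows quadratically with the length of ts; it is meant to show the cause of the problem rather than to be the fastest approach.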
import collections

ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
flat_list = [item for sublist in ts for item in sublist]
duplicates = [item for item, count in collections.Counter(flat_list).items() if count > 1]
print(duplicates)

Explanation:

With your input, you first need to flatten the list.

#1 Simple and pythonic
flat_list = [item
                for sublist in ts
                    for item in sublist]

#2 More efficient.
import itertools
flat_list = itertools.chain.from_iterable(ts)

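With method #1, flat_list is a list object; with method #2, it is a lazy iterator (an itertools.chain object). Both yield the same elements when iterated, although the iterator can be consumed only once.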
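The practical difference between the two shows up when flat_list is iterated more than once. A small sketch (the names `flat_iter`, `first_pass` and `second_pass` are introduced here for illustration):

```python
from itertools import chain

ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]

flat_list = [item for sublist in ts for item in sublist]  # method #1: a real list
flat_iter = chain.from_iterable(ts)                       # method #2: a lazy, single-use iterator

first_pass = list(flat_iter)    # consumes the iterator
second_pass = list(flat_iter)   # nothing is left to yield

print(first_pass == flat_list)  # True
print(second_pass)              # []
```

Since collections.Counter only needs a single pass over its input, either form works for the counting step.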

Now you can count the elements in flat_list. Any element whose count is greater than 1 is a duplicate.

for item, count in collections.Counter(flat_list).items():
    if count > 1:
        print(item)

Or you can use a more pythonic list comprehension.

duplicates = [item
                 for item, count in collections.Counter(flat_list).items()
                     if count > 1]
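Putting the two steps together, a possible helper could look like the sketch below. The function name `common_elements` and its `min_count` parameter are assumptions made for this example; the result is sorted so that it does not depend on the iteration order of the Counter.

```python
from collections import Counter
from itertools import chain


def common_elements(ts, min_count=2):
    """Return the sorted values that occur at least min_count times across all tuples."""
    counts = Counter(chain.from_iterable(ts))
    return sorted(item for item, count in counts.items() if count >= min_count)


ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]
ts_alt = [(701, 703), (702, 703), (703, 704), (803, 805), (803, 806),
          (901, 903), (902, 903), (903, 904)]

print(common_elements(ts))      # [703, 803, 903]
print(common_elements(ts_alt))  # [703, 803, 903]
```

Both the original input and the alternate input from the question produce the same list, as the question requires.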

You need to flatten your list of tuples. You can use itertools.chain:

>>> from itertools import chain

>>> flat_list = list(chain(*ts))
>>> flat_list
[702, 703, 703, 704, 803, 805, 803, 806, 901, 903, 902, 903]

Alternatively, you can use itertools.chain.from_iterable, which does the same thing but does not require unpacking the iterable:

>>> flat_list = list(chain.from_iterable(ts))
>>> flat_list
[702, 703, 703, 704, 803, 805, 803, 806, 901, 903, 902, 903]

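Once that is done, you can use collections.Counter to count the occurrences of each element in the flattened list, and then keep the elements that occur more than once.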

>>> from collections import Counter
>>> c = Counter(flat_list)
>>> c
Counter({803: 2, 903: 2, 703: 2, 704: 1, 805: 1, 806: 1, 901: 1, 902: 1, 702: 1})

And then finally filter c:

>>> [k for k,v in c.items() if v>1]
[803, 903, 703]

>>> from collections import Counter
>>> ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
>>> c = Counter(el for t in ts for el in t)
>>> [k for k in c if c[k] >= 2]
[703, 803, 903]
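One edge case to be aware of: counting raw elements treats a value that is repeated inside a single tuple as if it were shared between tuples. The question's data has no such tuples, so the sketch below uses a hypothetical input and assumes that "common" means "present in at least two different tuples". Deduplicating each tuple with set(t) before counting handles that case.

```python
from collections import Counter

ts_with_repeat = [(5, 5), (6, 7), (7, 8)]  # hypothetical input, not from the question

# Counting raw elements: the two 5s inside (5, 5) look like a match.
per_element = Counter(el for t in ts_with_repeat for el in t)
raw_result = sorted(k for k in per_element if per_element[k] >= 2)
print(raw_result)    # [5, 7]

# Counting each value at most once per tuple: only 7 is in two different tuples.
per_tuple = Counter(el for t in ts_with_repeat for el in set(t))
tuple_result = sorted(k for k in per_tuple if per_tuple[k] >= 2)
print(tuple_result)  # [7]
```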

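Here is a solution that does it in a single pass instead of two, building up the result as it goes (I'm not sure whether it is faster or slower in practice for a very large ts):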

>>> from collections import Counter
>>> from itertools import chain
>>> ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
>>> def find_common(ts):
...   c = Counter()
...   for num in chain.from_iterable(ts):
...     c[num] += 1
...     if c[num] == 2:
...       yield num
... 
>>> list(find_common(ts))
[703, 803, 903]

Without Counter:

>>> def find_common(ts):
...   seen, dupes = set(), set()
...   for num in chain.from_iterable(ts):
...     if num in seen and num not in dupes:
...       dupes.add(num)
...       yield num
...     seen.add(num)
... 
>>> list(find_common(ts))
[703, 803, 903]
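Whether the single-pass version is actually faster can be checked with timeit. The sketch below is one possible way to compare the two approaches; the function names `two_pass` and `one_pass`, the synthetic `big_ts` data and its sizes are all assumptions chosen for illustration, and the timings will vary by machine and Python version.

```python
import random
import timeit
from collections import Counter
from itertools import chain


def two_pass(ts):
    # Count everything first, then filter the counts.
    counts = Counter(chain.from_iterable(ts))
    return [k for k, v in counts.items() if v >= 2]


def one_pass(ts):
    # Record a value the moment it is seen for the second time.
    seen, dupes, result = set(), set(), []
    for num in chain.from_iterable(ts):
        if num in seen and num not in dupes:
            dupes.add(num)
            result.append(num)
        seen.add(num)
    return result


random.seed(0)  # fixed seed so repeated runs use the same synthetic data
big_ts = [(random.randrange(50_000), random.randrange(50_000)) for _ in range(100_000)]

# Both approaches should find the same set of values.
assert sorted(two_pass(big_ts)) == sorted(one_pass(big_ts))

for fn in (two_pass, one_pass):
    seconds = timeit.timeit(lambda: fn(big_ts), number=5)
    print(f"{fn.__name__}: {seconds:.3f} s for 5 runs")
```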
