
Find intersection of two nested lists?

I know how to get the intersection of two flat lists:

b1 = [1,2,3,4,5,9,11,15]
b2 = [4,5,6,7,8]
b3 = [val for val in b1 if val in b2]

or

def intersect(a, b):
    return list(set(a) & set(b))
 
print(intersect(b1, b2))

But my problem starts when I have to find the intersection for nested lists:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

In the end I would like to get:

c3 = [[13,32],[7,13,28],[1,6]]

Can you guys give me a hand with this?


You don't need to define intersection. It's already a first-class part of set.

>>> b1 = [1,2,3,4,5,9,11,15]
>>> b2 = [4,5,6,7,8]
>>> set(b1).intersection(b2)
set([4, 5])

If you want:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
c3 = [[13, 32], [7, 13, 28], [1,6]]

Then here is your solution for Python 2:

c3 = [filter(lambda x: x in c1, sublist) for sublist in c2]

In Python 3, filter returns an iterable instead of a list, so you need to wrap the filter call with list():

c3 = [list(filter(lambda x: x in c1, sublist)) for sublist in c2]

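Explanation:

The filter part takes each item of a sublist and checks whether it is in the source list c1. The list comprehension repeats this for every sublist in c2.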
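To make the explanation concrete, here is a small Python 3 sketch that pulls the filtering step out into a helper. The name keep_common is made up for illustration and is not part of the answer above.

```python
c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

def keep_common(sublist, source):
    """Return the items of sublist that also appear in source, in sublist order."""
    # filter() yields only the items for which the lambda returns True.
    return list(filter(lambda x: x in source, sublist))

# The outer list comprehension repeats the filtering for every sublist of c2.
c3 = [keep_common(sublist, c1) for sublist in c2]
print(c3)  # [[13, 32], [7, 13, 28], [1, 6]]
```

Each inner list keeps the order of its sublist in c2, because filter walks the sublist from left to right.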

For people just looking to find the intersection of two lists, the asker provided two methods:

b1 = [1,2,3,4,5,9,11,15]
b2 = [4,5,6,7,8]
b3 = [val for val in b1 if val in b2]

and

def intersect(a, b):
    return list(set(a) & set(b))

print(intersect(b1, b2))

But there is a hybrid method that is more efficient, because you only have to do one conversion between list and set, as opposed to three:

b1 = [1,2,3,4,5]
b2 = [3,4,5,6]
s2 = set(b2)
b3 = [val for val in b1 if val in s2]

This will run in O(n), whereas the asker's original method involving a list comprehension will run in O(n^2).
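A rough way to see the difference is to time both variants with timeit. This is only a sketch: the list sizes and the function names with_list and with_set are arbitrary choices, and the actual numbers depend on your machine.

```python
import timeit

b1 = list(range(0, 2000))
b2 = list(range(1000, 3000))

def with_list(a, b):
    # "val in b" scans the list b, so each lookup is O(len(b)).
    return [val for val in a if val in b]

def with_set(a, b):
    # Convert b once; "val in s" is then an average O(1) hash lookup.
    s = set(b)
    return [val for val in a if val in s]

# Both approaches produce the same result, in the order of a.
assert with_list(b1, b2) == with_set(b1, b2)

print("list lookup:", timeit.timeit(lambda: with_list(b1, b2), number=3))
print("set lookup: ", timeit.timeit(lambda: with_set(b1, b2), number=3))
```

The gap should widen as the lists grow, since the list version does a linear scan for every element of a.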

The functional approach:

from functools import reduce  # needed in Python 3, where reduce is not a builtin

input_list = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]

result = reduce(set.intersection, map(set, input_list))

It can be applied to the more general case of 1+ lists.
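As a quick sketch of the "1+ lists" case, the reduce call can be wrapped in a helper and tried with one list and with several. The name intersect_all is hypothetical, chosen only for this example.

```python
from functools import reduce  # reduce is a builtin only in Python 2

def intersect_all(lists):
    """Intersect one or more lists and return the result as a set."""
    return reduce(set.intersection, map(set, lists))

# With a single list, reduce simply returns that list as a set.
print(sorted(intersect_all([[1, 2, 3]])))  # [1, 2, 3]

# With several lists, the sets are intersected pairwise from left to right.
print(sorted(intersect_all([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]])))  # [3, 4, 5]
```

Note that an empty input list raises a TypeError, because reduce has no initial value to fall back on.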

Pure list comprehension version

>>> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
>>> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
>>> c1set = frozenset(c1)

Flatten variant:

>>> [n for lst in c2 for n in lst if n in c1set]
[13, 32, 7, 13, 28, 1, 6]

Nested variant:

>>> [[n for n in lst if n in c1set] for lst in c2]
[[13, 32], [7, 13, 28], [1, 6]]

The & operator takes the intersection of two sets.

{1, 2, 3} & {2, 3, 4}
Out[1]: {2, 3}
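One detail worth knowing when applying this to lists: the & operator needs a set on both sides, while the intersection() method accepts any iterable. A minimal sketch, with variable names made up for illustration:

```python
a = [1, 2, 3]
b = [2, 3, 4]

# The method form accepts any iterable as its argument.
common = set(a).intersection(b)

# The operator form needs a set on both sides.
common_op = set(a) & set(b)

try:
    set(a) & b  # a list on the right-hand side is rejected
except TypeError as exc:
    print("TypeError:", exc)

print(sorted(common), sorted(common_op))  # [2, 3] [2, 3]
```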

The pythonic way of taking the intersection of 2 lists is:

[x for x in list1 if x in list2]

You should flatten using this code (taken from http://kogs-www.informatik.uni-hamburg.de/~meine/python_tricks ). The code is untested, but I'm pretty sure it works:


def flatten(x):
    """flatten(sequence) -> list

    Returns a single, flat list which contains all elements retrieved
    from the sequence and all recursively contained sub-sequences
    (iterables).

    Examples:
    >>> [1, 2, [3,4], (5,6)]
    [1, 2, [3, 4], (5, 6)]
    >>> flatten([[[1,2,3], (42,None)], [4,5], [6], 7, MyVector(8,9,10)])
    [1, 2, 3, 42, None, 4, 5, 6, 7, 8, 9, 10]"""

    result = []
    for el in x:
        #if isinstance(el, (list, tuple)):
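        # Strings are excluded so they are not split into characters
        # (basestring exists only in Python 2; (str, bytes) works in both).
        if hasattr(el, "__iter__") and not isinstance(el, (str, bytes)):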
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

After flattening the lists, you perform the intersection in the usual way:


c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

def intersect(a, b):
     return list(set(a) & set(b))

print(intersect(flatten(c1), flatten(c2)))

Since intersect is defined, a basic list comprehension is enough:

>>> c3 = [intersect(c1, i) for i in c2]
>>> c3
[[32, 13], [28, 13, 7], [1, 6]]

Improved thanks to S. Lott's remark and TM's associated remark:

>>> c3 = [list(set(c1).intersection(i)) for i in c2]
>>> c3
[[32, 13], [28, 13, 7], [1, 6]]
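The order inside each result above comes from set iteration, which is not something to rely on. A possible refinement, offered as a sketch rather than as part of the answer, is to build the set for c1 once and sort each result. The name c1_set is made up here.

```python
c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

# Build the set once instead of once per sublist.
c1_set = set(c1)

# sorted() turns each intersection into a list with a predictable order.
c3 = [sorted(c1_set.intersection(sublist)) for sublist in c2]
print(c3)  # [[13, 32], [7, 13, 28], [1, 6]]
```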

Given:

> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]

> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

I find the following code works well, and it may be more concise if you use set operations:

> c3 = [list(set(f)&set(c1)) for f in c2] 

It gives:

> [[32, 13], [28, 13, 7], [1, 6]]

If order is needed:

> c3 = [sorted(list(set(f)&set(c1))) for f in c2] 

we get:

> [[13, 32], [7, 13, 28], [1, 6]]

By the way, for a more Pythonic style, this one works too:

> c3 = [ [i for i in set(f) if i in c1] for f in c2]

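I don't know if I am late in answering your question. After reading it I came up with a function intersect() that works on both flat lists and nested lists. I defined it recursively, which is quite intuitive. Hope it is what you are looking for: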

def intersect(a, b):
    result=[]
    for i in b:
        if isinstance(i,list):
            result.append(intersect(a,i))
        else:
            if i in a:
                 result.append(i)
    return result

Example:

>>> c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
>>> c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
>>> print(intersect(c1, c2))
[[13, 32], [7, 13, 28], [1, 6]]

>>> b1 = [1,2,3,4,5,9,11,15]
>>> b2 = [4,5,6,7,8]
>>> print(intersect(b1, b2))
[4, 5]
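Because the function recurses into every list it meets, it also handles deeper nesting. Here is a sketch of the same idea with a set for the membership test. The names intersect_nested, walk and deep are made up, and the sketch assumes the items of a are hashable and that a itself is flat.

```python
def intersect_nested(a, b):
    """Recursively keep the items of b (at any depth) that are in a."""
    lookup = set(a)  # set lookups instead of repeated list scans

    def walk(items):
        result = []
        for item in items:
            if isinstance(item, list):
                result.append(walk(item))  # preserve the nesting of b
            elif item in lookup:
                result.append(item)
        return result

    return walk(b)

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
deep = [[13, [17, 32]], [7, [[28], 11]], 1]
print(intersect_nested(c1, deep))  # [[13, [32]], [7, [[28]]], 1]
```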

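Do you consider [1,2] to intersect with [1, [2]]? That is, is it only the numbers you care about, or the list structure as well?

If only the numbers, investigate how to "flatten" the lists, then use the set() method.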
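For the "only the numbers" reading, one way to flatten is itertools.chain.from_iterable. This is a sketch that assumes exactly one level of nesting, as in the question's c2; the names flat_c2 and common are made up.

```python
from itertools import chain

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

# chain.from_iterable flattens exactly one level of nesting.
flat_c2 = list(chain.from_iterable(c2))

# Only the numbers matter here, so the sublist structure is discarded.
common = sorted(set(c1).intersection(flat_c2))
print(common)  # [1, 6, 7, 13, 28, 32]
```

This gives one flat result, which is different from the per-sublist c3 the question asks for.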

I was also looking for a way to do this, and eventually ended up with this:

def compareLists(a,b):
    removed = [x for x in a if x not in b]
    added = [x for x in b if x not in a]
    overlap = [x for x in a if x in b]
    return [removed,added,overlap]
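A usage sketch on the question's flat lists follows. It uses a variant named compare_lists (a hypothetical rename) that precomputes sets for the membership tests, which assumes the items are hashable; the returned structure is the same.

```python
def compare_lists(a, b):
    """Return [removed, added, overlap], each in its original order."""
    set_a, set_b = set(a), set(b)
    removed = [x for x in a if x not in set_b]  # only in a
    added = [x for x in b if x not in set_a]    # only in b
    overlap = [x for x in a if x in set_b]      # in both, in the order of a
    return [removed, added, overlap]

b1 = [1, 2, 3, 4, 5, 9, 11, 15]
b2 = [4, 5, 6, 7, 8]
removed, added, overlap = compare_lists(b1, b2)
print(removed)  # [1, 2, 3, 9, 11, 15]
print(added)    # [6, 7, 8]
print(overlap)  # [4, 5]
```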

To define an intersection that correctly takes the cardinality of the elements into account, use Counter:

from collections import Counter

>>> c1 = [1, 2, 2, 3, 4, 4, 4]
>>> c2 = [1, 2, 4, 4, 4, 4, 5]
>>> list((Counter(c1) & Counter(c2)).elements())
[1, 2, 4, 4, 4]
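To see why this works, it may help to look at the intermediate Counter. A small sketch on the same data, where the variable name common is made up:

```python
from collections import Counter

c1 = [1, 2, 2, 3, 4, 4, 4]
c2 = [1, 2, 4, 4, 4, 4, 5]

common = Counter(c1) & Counter(c2)
# For each element, & keeps the minimum of the two counts:
# 1 -> min(1, 1), 2 -> min(2, 1), 4 -> min(3, 4); 3 and 5 drop out.
print(dict(common))  # {1: 1, 2: 1, 4: 3}

# A set-based intersection collapses the duplicates instead.
print(sorted(set(c1) & set(c2)))  # [1, 2, 4]
```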

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

# range() works in both Python 2 and 3; xrange() exists only in Python 2
c3 = [list(set(c2[i]).intersection(set(c1))) for i in range(len(c2))]

c3
->[[32, 13], [28, 13, 7], [1, 6]]

We can use set methods for this:

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]

result = []
for li in c2:
    res = set(li) & set(c1)
    result.append(list(res))

print(result)

# Problem:  Given c1 and c2:
c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
# how do you get c3 to be [[13, 32], [7, 13, 28], [1, 6]] ?

Here's one way to set c3 that doesn't involve sets:

c3 = []
for sublist in c2:
    c3.append([val for val in c1 if val in sublist])

But if you prefer to use just one line, you can do this:

c3 = [[val for val in c1 if val in sublist]  for sublist in c2]

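It's a list comprehension inside a list comprehension, which is a little unusual, but I think you shouldn't have too much trouble following it.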
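One thing to be aware of with this form: the inner comprehension iterates over c1, so each result follows the order of c1 rather than the order of the sublist. With the question's sorted data the two coincide; the sketch below uses made-up unsorted data to show the difference.

```python
c1 = [63, 1, 13]             # deliberately unsorted, made-up data
sublist = [1, 13, 63, 99]

# Iterating over c1 keeps the order of c1.
by_c1 = [val for val in c1 if val in sublist]

# Iterating over the sublist keeps the order of the sublist.
by_sublist = [val for val in sublist if val in c1]

print(by_c1)       # [63, 1, 13]
print(by_sublist)  # [1, 13, 63]
```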

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
c3 = [list(set(i) & set(c1)) for i in c2]
c3
[[32, 13], [28, 13, 7], [1, 6]]

For me this is a very elegant and quick way to do it :)

A flat list can be made through reduce easily.

All you need is the initializer, the third argument of the reduce function.

reduce(
   lambda result, _list: result.append(
       list(set(_list)&set(c1)) 
     ) or result, 
   c2, 
   [])

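The above code works in both Python 2 and Python 3, but you need to import reduce first with from functools import reduce (in Python 3 it is no longer a builtin). See the documentation of functools.reduce for details.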
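The "append(...) or result" part works because list.append returns None, so the expression falls through to result. If that reads as too clever, a sketch of an equivalent that builds a new list at each step could look like this. The names acc and c1_set are made up, and sorted() is added only to get a predictable order.

```python
from functools import reduce

c1 = [1, 6, 7, 10, 13, 28, 32, 41, 58, 63]
c2 = [[13, 17, 18, 21, 32], [7, 11, 13, 14, 28], [1, 5, 6, 8, 15, 16]]
c1_set = set(c1)

# acc + [...] returns a new list each step, so no append trick is needed.
c3 = reduce(
    lambda acc, sublist: acc + [sorted(c1_set.intersection(sublist))],
    c2,
    [],  # initializer: the accumulator starts as an empty list
)
print(c3)  # [[13, 32], [7, 13, 28], [1, 6]]
```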

A simple way to find the difference and intersection between iterables

Use this method if repetition matters

from collections import Counter

def intersection(a, b):
    """
    Find the intersection of two iterables

    >>> intersection((1,2,3), (2,3,4))
    (2, 3)

    >>> intersection((1,2,3,3), (2,3,3,4))
    (2, 3, 3)

    >>> intersection((1,2,3,3), (2,3,4,4))
    (2, 3)

    """
    return tuple(n for n, count in (Counter(a) & Counter(b)).items() for _ in range(count))

def difference(a, b):
    """
    Find the symmetric difference of two iterables

    >>> difference((1,2,3), (2,3,4))
    (1, 4)

    >>> difference((1,2,3,3), (2,3,4))
    (1, 3, 4)

    >>> difference((1,2,3,3), (2,3,4,4))
    (1, 3, 4, 4)
    """
    diff = lambda x, y: tuple(n for n, count in (Counter(x) - Counter(y)).items() for _ in range(count))
    return diff(a, b) + diff(b, a)

from random import sample

a = sample(range(0, 1000), 100)
b = sample(range(0, 1000), 100)
print(a)
print(b)
print(set(a).intersection(set(b)))
