Common elements comparison between 2 lists

def common_elements(list1, list2):
    """
    Return a list containing the elements which are in both list1 and list2

    >>> common_elements([1,2,3,4,5,6], [3,5,7,9])
    [3, 5]
    >>> common_elements(['this','this','n','that'],['this','not','that','that'])
    ['this', 'that']
    """
    for element in list1:
        if element in list2:
            return list(element)

Got that so far, but can't seem to get it to work!

Any ideas?

Use Python's set intersection:

>>> list1 = [1,2,3,4,5,6]
>>> list2 = [3, 5, 7, 9]
>>> list(set(list1).intersection(list2))
[3, 5]

The solutions suggested by S.Mark and SilentGhost generally tell you how it should be done in a Pythonic way, but I thought you might also benefit from knowing why your solution doesn't work. The problem is that as soon as you find the first common element in the two lists, you return that single element only (and list(element) raises a TypeError for an integer, or splits a string into its characters). Your solution could be fixed by creating a result list and collecting the common elements in that list:

def common_elements(list1, list2):
    result = []
    for element in list1:
        if element in list2:
            result.append(element)
    return result

An even shorter version using list comprehensions:

def common_elements(list1, list2):
    return [element for element in list1 if element in list2]

However, as I said, this is a very inefficient way of doing it: each membership test scans list2 from the start, whereas Python's built-in set types use hash lookups (and are implemented in C internally), which makes them far more efficient.
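A possible middle ground, sketched here on the assumption that the elements are hashable: keep the list comprehension, so the order and repeats of list1 survive, but convert list2 to a set once so that each membership test is a hash lookup. The name common_elements_ordered is only illustrative.

```python
def common_elements_ordered(list1, list2):
    """Elements of list1 that also occur in list2, in list1's order."""
    lookup = set(list2)  # built once; assumes the elements are hashable
    return [element for element in list1 if element in lookup]


print(common_elements_ordered([1, 2, 3, 4, 5, 6], [3, 5, 7, 9]))
print(common_elements_ordered(['this', 'this', 'n', 'that'],
                              ['this', 'not', 'that', 'that']))
```

Note that an element repeated in list1 is repeated in the result, so the second call gives 'this' twice.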

You can also use sets and get the commonalities in one line: subtract the set containing the differences from one of the sets.

A = [1,2,3,4]
B = [2,4,7,8]
commonalities = set(A) - (set(A) - set(B))
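As a quick sanity check (a sketch only; the variable names by_subtraction and by_intersection are made up for illustration), the double subtraction gives the same set as the & operator:

```python
A = [1, 2, 3, 4]
B = [2, 4, 7, 8]

# A minus "what is only in A" leaves what A shares with B
by_subtraction = set(A) - (set(A) - set(B))
by_intersection = set(A) & set(B)

print(by_subtraction == by_intersection)  # True
print(sorted(by_subtraction))             # [2, 4]
```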

You can solve this using numpy:

import numpy as np

list1 = [1, 2, 3, 4, 5, 6]
list2 = [3, 5, 7, 9]

common_elements = np.intersect1d(list1, list2)
print(common_elements)

common_elements will be the numpy array: [3 5].

Use set intersection, set(list1) & set(list2):

>>> def common_elements(list1, list2):
...     return list(set(list1) & set(list2))
...
>>> common_elements([1,2,3,4,5,6], [3,5,7,9])
[3, 5]
>>> common_elements(['this','this','n','that'],['this','not','that','that'])
['this', 'that']

Note that the result list may be in a different order from the original list.
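If the order matters, one option (a sketch, assuming hashable elements and Python 3.7+ where dicts keep insertion order) is to filter list1 against a set and then drop repeats with dict.fromkeys. The function name common_elements_in_order is hypothetical.

```python
def common_elements_in_order(list1, list2):
    """Unique common elements, in the order they first appear in list1."""
    lookup = set(list2)
    # dict.fromkeys drops repeats while keeping first-seen order
    return list(dict.fromkeys(x for x in list1 if x in lookup))


print(common_elements_in_order([1, 2, 3, 4, 5, 6], [3, 5, 7, 9]))
print(common_elements_in_order(['this', 'this', 'n', 'that'],
                               ['this', 'not', 'that', 'that']))
```

This reproduces both results from the question's docstring deterministically, which a plain set conversion does not guarantee.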

Set is another way we can solve this:

a = [3,2,4]
b = [2,3,5]
set(a)&set(b)
{2, 3}

The previous answers all work to find the unique common elements, but will fail to account for repeated items in the lists. If you want the common elements to appear as many times as they occur in both lists, you can use the following one-liner:

l2, common = l2[:], [ e for e in l1 if e in l2 and (l2.pop(l2.index(e)) or True)]

The or True part is only necessary if you expect any elements to evaluate to False.
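For comparison, here is a sketch of the same idea with collections.Counter, whose & operator keeps the minimum count of each element. It assumes the elements are hashable and that the order of the result does not matter; common_with_repeats is an illustrative name, not part of the answer above.

```python
from collections import Counter


def common_with_repeats(l1, l2):
    """Common elements, each repeated min(count in l1, count in l2) times."""
    shared = Counter(l1) & Counter(l2)  # multiset intersection: minimum of counts
    return list(shared.elements())


print(sorted(common_with_repeats([1, 1, 2, 3, 3, 3], [1, 1, 1, 3, 3, 4])))
# [1, 1, 3, 3]
```

Unlike the one-liner, this leaves both input lists untouched and avoids the repeated index and pop calls.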

I compared each of the methods mentioned in the answers. I used Python 3.6.3 for this. This is the code that I used:

import time
import random
from decimal import Decimal


def method1():
    common_elements = [x for x in li1_temp if x in li2_temp]
    print(len(common_elements))


def method2():
    common_elements = (x for x in li1_temp if x in li2_temp)
    print(len(list(common_elements)))


def method3():
    common_elements = set(li1_temp) & set(li2_temp)
    print(len(common_elements))


def method4():
    common_elements = set(li1_temp).intersection(li2_temp)
    print(len(common_elements))


if __name__ == "__main__":
    li1 = []
    li2 = []
    for i in range(100000):
        li1.append(random.randint(0, 10000))
        li2.append(random.randint(0, 10000))

    li1_temp = list(set(li1))
    li2_temp = list(set(li2))

    methods = [method1, method2, method3, method4]
    for m in methods:
        start = time.perf_counter()
        m()
        end = time.perf_counter()
        print(Decimal((end - start)))

If you run this code, you can see that the list comprehension and the generator perform almost identically (provided the generator is actually consumed, which is why method2 converts it to a list before taking its length). Sets perform far better, and the intersection method is slightly faster still. The results on my computer, in seconds, are listed below:

  1. method1: 0.8150673999999999974619413478649221360683441
  2. method2: 0.8329545000000001531148541289439890533685684
  3. method3: 0.0016547000000000089414697868051007390022277
  4. method4: 0.0010262999999999244948867271887138485908508
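A similar comparison can be sketched with the standard timeit module, which handles the timing loop for you. The list sizes below are scaled down from the ones above (an arbitrary choice so the slow variant finishes quickly), the candidates dictionary is just an illustrative way to organize the runs, and the absolute numbers will differ from machine to machine.

```python
import random
import timeit

random.seed(0)  # fixed seed so repeated runs use the same data
li1 = list({random.randint(0, 2000) for _ in range(20000)})
li2 = list({random.randint(0, 2000) for _ in range(20000)})

candidates = {
    "list comprehension": lambda: [x for x in li1 if x in li2],
    "set & set": lambda: set(li1) & set(li2),
    "set.intersection": lambda: set(li1).intersection(li2),
}

for name, func in candidates.items():
    # best of 3 single runs, in seconds
    best = min(timeit.repeat(func, number=1, repeat=3))
    print(f"{name}: {best:.6f} s ({len(func())} common elements)")
```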

1) Method 1: store list1 in a dictionary, then iterate over each element in list2

def findarrayhash(a,b):
    h1={k:1 for k in a}
    for val in b:
        if val in h1:
            print("common found",val)
            del h1[val]
        else:
            print("different found",val)
    for key in h1:  # dict.iterkeys() no longer exists in Python 3
        print("different found",key)

Finding Common and Different elements:

2) Method 2: using set

def findarrayset(a,b):
    common = set(a)&set(b)
    diff=set(a)^set(b)
    print(list(common))
    print(list(diff))

Here's a rather brute-force method that I came up with. It's certainly not the most efficient, but it's something.

The problem I found with some of the solutions here is that they either don't give repeated elements or don't give the correct number of elements when the input order matters.

#finds common elements
def common(list1, list2):
    result = []
    intersect = list(set(list1).intersection(list2))

    #using the intersection, find the min
    count1 = 0
    count2 = 0
    for i in intersect: 
        for j in list1:
            if i == j:
                count1 += 1
        for k in list2: 
            if i == k:
                count2 += 1
        minCount = min(count2,count1)
        count1 = 0
        count2 = 0

        #append common factor that many times
        for j in range(minCount):
            result.append(i)

    return result

a_list = range(1,10)
b_list = range(5, 25)
both = []

for i in b_list:
    for j in a_list:
        if i == j:
            both.append(i)

f_list=[1,2,3,4,5] # First list
s_list=[3,4,5,6,7,8] # Second list
# An empty list stores the common elements present in both the list
common_elements=[]

for i in f_list:
    # checking if each element of first list exists in second list
    if i in s_list:
        #if so add it in common elements list
        common_elements.append(i) 
print(common_elements)

Hi, this is my proposal (very simple):

i = [1,4,10,22,44,6,12] # first sample list, could change in the future
j = [1,4,10,8,15,14] # second sample list, could change in the future
for x in i:
    if x in j: # for each item 'x' from collection 'i', look for the same item in collection 'j'
        print(x) # print out the results

def common_member(a, b):
    a_set = set(a) 
    b_set = set(b) 
    if (a_set & b_set): 
        print(a_set & b_set) 
    else: 
        print("No common elements")

list_1=range(0,100)
list_2=range(0,100,5)
final_list=[]
for i in list_1:
    for j in list_2:
        if i==j:
            final_list.append(i)
print(set(final_list))

Your problem is that you're returning from inside the for loop, so you'll only get the first match. The solution is to move your return outside the loop.

def elementosEnComunEntre(lista1,lista2):

    elementosEnComun = set()

    for e1 in lista1:
        if e1 in lista2:
            elementosEnComun.add(e1)

    return list(elementosEnComun)

There are solutions here that do it in O(l1+l2) but don't count repeated items, and slow solutions (at least O(l1*l2), but probably more expensive) that do consider repeated items.

So I figured I should add an O(l1*log(l1) + l2*log(l2)) solution. This is particularly useful if the lists are already sorted.

def common_elems_with_repeats(first_list, second_list):
    first_list = sorted(first_list)
    second_list = sorted(second_list)
    marker_first = 0
    marker_second = 0
    common = []
    while marker_first < len(first_list) and marker_second < len(second_list):
        if(first_list[marker_first] == second_list[marker_second]):
            common.append(first_list[marker_first])
            marker_first +=1
            marker_second +=1
        elif first_list[marker_first] > second_list[marker_second]:
            marker_second += 1
        else:
            marker_first += 1
    return common

Another, faster solution would be to build an item->count map from list1 and iterate through list2, updating the map and counting duplicates. It wouldn't require sorting. It would require a bit of extra memory, but it's technically O(l1+l2).
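That map-based idea could look roughly like the sketch below. It assumes hashable items and uses collections.Counter as the item->count map; the function name common_elems_counting and the choice to return matches in second_list's order are illustrative decisions, not something the description above prescribes.

```python
from collections import Counter


def common_elems_counting(first_list, second_list):
    """Common elements with repeats, in second_list's order, without sorting."""
    remaining = Counter(first_list)  # item -> copies of it not yet matched
    common = []
    for item in second_list:
        if remaining[item] > 0:  # a Counter returns 0 for missing items
            common.append(item)
            remaining[item] -= 1
    return common


print(common_elems_counting([1, 1, 2, 3], [1, 3, 1, 1, 5]))  # [1, 3, 1]
```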

def list_common_elements(l_1,l_2,_unique=1,diff=0):
    if not diff:
        if _unique:
            return list(set(l_1)&set(l_2))
        if not _unique:
            return list((i for i in l_1 if i in l_2))
    if diff:
        if _unique:
            return list(set(l_1)^set(l_2))
        if not _unique:
            return list((i for i in l_1 if i not in l_2))
"""
Example:
l_1=             [0, 1, 2, 3, 3, 4, 5]
l_2=             [6, 7, 8, 8, 9, 5, 4, 3, 2]
look for diff
l_2,l_1,diff=1,_unique=1: [0, 1, 6, 7, 8, 9]        sorted(unique(L_2 not in L_1) + unique(L_1 not in L_2))
l_2,l_1,diff=1,_unique=0: [6, 7, 8, 8, 9]           L_2 not in L_1
l_1,l_2,diff=1,_unique=1: [0, 1, 6, 7, 8, 9]        sorted(unique(L_1 not in L_2) + unique(L_2 not in L_1))
l_1,l_2,diff=1,_unique=0: [0, 1]                    L_1 not in L_2
look for same
l_2,l_1,diff=0,_unique=1: [2, 3, 4, 5]              unique(L2 in L1)
l_2,l_1,diff=0,_unique=0: [5, 4, 3, 2]              L2 in L1
l_1,l_2,diff=0,_unique=1: [2, 3, 4, 5]              unique(L1 in L2)
l_1,l_2,diff=0,_unique=0: [2, 3, 3, 4, 5]           L1 in L2
"""

This function compares two lists (L_1 vs. L_2). The DIFF parameter selects whether to search for common elements (DIFF==False) or differing elements (DIFF==True). Within each mode, the behavior is set by the _UNIQUE parameter. _UNIQUE==True uses Python sets, so the function returns a list of the unique elements satisfying DIFF, in no guaranteed order. When _UNIQUE==False, the returned list is more explicit: it contains the elements of L_1 that satisfy DIFF, in their original order and with their repetitions, so swap the arguments to get the corresponding elements of L_2. Since that output keeps repeated occurrences, the user can count afterwards how many times an element differs or is common between the lists. As this proposal is simply a compilation of code proposed by "cowlinator" and JS Method2, please see those authors' posts for discussion of the speed and performance of the calculation. Credits: cowlinator and JS Method2.

If list1 and list2 are unsorted:

Using intersection:

print((set(list1)).intersection(set(list2)))

Combining the lists and checking whether an element occurs more than once:

combined_list = list1 + list2
set([num for num in combined_list if combined_list.count(num) > 1])

Similar to above but without using set:

seen = []  # already reported; avoids removing items from the list while iterating over it
for num in combined_list:
    if combined_list.count(num) > 1 and num not in seen:
        print(num)
        seen.append(num)

For sorted lists, without special Python built-ins, an O(n) solution:

p1 = 0
p2 = 0
result = []
while p1 < len(list1) and p2 < len(list2):
    if list1[p1] == list2[p2]:
        result.append(list1[p1])
        p1 += 1
        p2 += 1
    elif list1[p1] > list2[p2]:
        p2 += 1
    else:
        p1 += 1
print(result)

I have worked out a full solution for deep intersection:

def common_items_dict(d1, d2, use_set_for_list_commons=True, use_set_for_dict_key_commons=True, append_empty=False):
    result = {}
    if use_set_for_dict_key_commons:
        shared_keys=list(set(d1.keys()).intersection(d2.keys())) # faster, order not preserved
    else:
        shared_keys=common_items_list(d1.keys(), d2.keys(), use_set_for_list_commons=False)

    for k in  shared_keys:
        v1 = d1[k]
        v2 = d2[k]
        if isinstance(v1, dict) and isinstance(v2, dict):
            result_dict=common_items_dict(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
            if len(result_dict)>0 or append_empty:
                result[k] = result_dict 
        elif isinstance(v1, list) and isinstance(v2, list):
            result_list=common_items_list(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
            if len(result_list)>0 or append_empty:
                result[k] = result_list 
        elif v1 == v2:
            result[k] = v1
    return result

def common_items_list(d1, d2, use_set_for_list_commons=True, use_set_for_dict_key_commons=True, append_empty=False):
    if use_set_for_list_commons: 
        result_list= list(set(d2).intersection(d1)) # faster, order not preserved, support only simple data types in list values
        return result_list

    result = []
    for v1 in d1: 
        for v2 in d2:
            if isinstance(v1, dict) and isinstance(v2, dict):
                result_dict=common_items_dict(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
                if len(result_dict)>0 or append_empty:
                    result.append(result_dict)
            elif isinstance(v1, list) and isinstance(v2, list):
                result_list=common_items_list(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
                if len(result_list)>0 or append_empty:
                    result.append(result_list)
            elif v1 == v2:
                result.append(v1)
    return result


def deep_commons(v1,v2, use_set_for_list_commons=True, use_set_for_dict_key_commons=True, append_empty=False):
    """
    deep_commons
     returns intersection of items of dict and list combinations recursively

    this function is a starter function, 
    i.e. if you know that the initial input is always dict then you can use common_items_dict directly
    or if it is a list you can use common_items_list directly

    v1 - dict/list/simple_value
    v2 - dict/list/simple_value
    use_set_for_dict_key_commons - bool - using set is faster, dict key order is not preserved 
    use_set_for_list_commons - bool - using set is faster, list values order not preserved, support only simple data types in list values
    append_empty - bool - if there is a common key but no common items in its value, True keeps the key with an empty list or dict

    """

    if isinstance(v1, dict) and isinstance(v2, dict):
        return common_items_dict(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
    elif isinstance(v1, list) and isinstance(v2, list):
        return common_items_list(v1, v2, use_set_for_list_commons, use_set_for_dict_key_commons, append_empty)
    elif v1 == v2:
        return v1
    else:
        return None


needed_services={'group1':['item1','item2'],'group3':['item1','item2']}
needed_services2={'group1':['item1','item2'],'group3':['item1','item2']}

result=deep_commons(needed_services,needed_services2)

print(result)

Use a generator:

common = (x for x in list1 if x in list2)

The advantage here is that creating the generator returns in constant time (nearly instantly), even with huge lists or other huge iterables; the comparisons only run as you consume it.

For example,

list1 =  list(range(0,10000000))
list2=list(range(1000,20000000))
common = (x for x in list1 if x in list2)

Answers that build the whole result up front with list membership tests will take a very long time with these values for list1 and list2.

You can then iterate over the result with:

for i in common: print(i)
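Keep in mind that each x in list2 test is still a linear scan once iteration starts. A sketch that combines the lazy generator with a set lookup is shown below; it assumes hashable items, uses smaller ranges than the example above so it runs quickly, and the helper name iter_common is hypothetical. itertools.islice takes only the first few matches without computing the rest.

```python
from itertools import islice


def iter_common(iterable1, iterable2):
    """Return a generator of the items of iterable1 that also occur in iterable2."""
    lookup = set(iterable2)  # one-time cost paid up front; assumes hashable items
    return (x for x in iterable1 if x in lookup)


common = iter_common(range(0, 100000), range(1000, 200000))
print(list(islice(common, 5)))  # [1000, 1001, 1002, 1003, 1004]
```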

list1=[123,324523,5432,311,23]
list2=[2343254,34234,234,322123,123,234,23]

def common_elements(list1,list2):
    common=[]  # collect matches locally instead of in a module-level list
    for x in list1:
        if x in list2:
            common.append(x)
    return common

print(common_elements(list1,list2))
