Comparing two lists - Python

Okay, I have two lists, list1 and list2. I want to find all of the items that are in both list1 and list2 and remove them from list1. My first idea is to loop through list1 and, for each item, loop through list2 to see whether it appears there, but that seems slow and inefficient when scaled up. Is there a more efficient way of doing this?

Also, these lists will be ordered alphabetically (they're strings), if that helps anything.

I'm using Python, but I'm also wondering from a general programming perspective.

list1 = ['bar','foo','hello','hi']
list2 = ['alpha','bar','hello','xam']

list1 would become ['foo','hi']

In Python, you'd probably want to use a set:

intersection = set(list1).intersection(list2)

This returns a set, which loses the original order (among other things), but you can always use that set to filter list1 afterward:

list1 = [x for x in list1 if x not in intersection]

The intersection is most useful if you actually want to use the set. As pointed out in the comments, it's not actually necessary if you don't want a set at all:

set2 = set(list2)
list1 = [x for x in list1 if x not in set2]
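
A minimal sketch that wraps the filter above in a reusable helper. The name remove_common is hypothetical, not part of the original answer. It assumes the items are hashable (strings are), and it keeps both the order and any duplicates from the first list:

```python
def remove_common(items, others):
    """Return items with every element that also appears in others removed."""
    # Build the lookup set once so each membership test is O(1) on average.
    others_set = set(others)
    return [x for x in items if x not in others_set]


list1 = ['bar', 'foo', 'hello', 'hi']
list2 = ['alpha', 'bar', 'hello', 'xam']
result = remove_common(list1, list2)
print(result)  # ['foo', 'hi']
```

Building the set costs one pass over the second list, so the whole operation is roughly linear in the combined length of both lists, compared with the nested-loop approach, whose cost grows with the product of the two lengths.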

Use a set to get the difference between the two:

list1 = ['bar','foo','hello','hi']
list2 = ['alpha','bar','hello','xam']

set1 = set(list1)
set2 = set(list2)
set1 - set2

Outputs:

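set(['hi', 'foo'])

(That is the Python 2 repr. Python 3 prints the same set as {'hi', 'foo'}, and the element order may vary because sets are unordered.)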

As noted by @chepner, with set.difference only the first list needs to be converted to a set, because the method accepts any iterable as its argument:

set1.difference(list2)
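
A short sketch of the difference between the method and the operator, using the same example lists. The variable names diff, operator_ok, and ordered are illustrative only:

```python
list1 = ['bar', 'foo', 'hello', 'hi']
list2 = ['alpha', 'bar', 'hello', 'xam']

set1 = set(list1)

# The method form accepts any iterable, so list2 can stay a list.
diff = set1.difference(list2)

# The operator form needs a set on both sides; a list raises TypeError.
try:
    set1 - list2
    operator_ok = True
except TypeError:
    operator_ok = False

# Sets are unordered, so sort the result if a stable order is needed.
ordered = sorted(diff)
print(ordered)  # ['foo', 'hi']
```

Sorting restores alphabetical order here only because the input was alphabetical to begin with; for an arbitrary original order, filtering the list against a set is the safer choice.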

If order is important, make one of them a set, and compare the other against it:

set2 = set(list2)
[x for x in list1 if x not in set2]

Outputs:

['foo', 'hi']

Here's a solution using a general programming approach, not using sets, and not particularly optimized. It relies on the two lists being sorted.

list1 = ['a', 'b', 'd', 'f', 'k']
list2 = ['c', 'd', 'i']
result = []

i1 = 0
i2 = 0
while i1 < len(list1) and i2 < len(list2):
    # invariants:
    #    list1[i1] not in list2[:i2], and
    #    result == (list1[:i1] with elements of list2[:i2] omitted)
    #
    if list1[i1] < list2[i2]:
        # By assumption, list1[i1] not in list2[:i2],
        # and because list2 is sorted, the true 'if' condition
        # implies that list1[i1] isn't in list2[i2:] either;
        # that is, it isn't in list2 at all.
        result.append(list1[i1])
        i1 += 1
    elif list1[i1] > list2[i2]:
        # can't decide membership of list1[i1] yet;
        # advance to next element of list2 and loop again
        i2 += 1
    else:
        # list1[i1] == list2[i2], so omit this element
        i1 += 1
        i2 += 1

# Add any remaining elements of list1 to tail of result
if i1 < len(list1):
    result.extend(list1[i1:])

print(result)

Result: ['a', 'b', 'f', 'k']
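
Another way to exploit the sort order, sketched here as an alternative to the two-index walk above rather than as part of that answer, is a binary search into list2 for each element of list1 using the standard-library bisect module. The helper name in_sorted is hypothetical. It assumes list2 is already sorted in ascending order:

```python
from bisect import bisect_left


def in_sorted(sorted_items, value):
    """Binary-search membership test; sorted_items must be sorted ascending."""
    # bisect_left returns the leftmost position where value could be inserted.
    pos = bisect_left(sorted_items, value)
    return pos < len(sorted_items) and sorted_items[pos] == value


list1 = ['a', 'b', 'd', 'f', 'k']
list2 = ['c', 'd', 'i']  # must already be sorted

result = [x for x in list1 if not in_sorted(list2, x)]
print(result)  # ['a', 'b', 'f', 'k']
```

Each lookup takes time logarithmic in the length of list2, so this needs no extra set in memory, though the single-pass walk above does less work when both lists are of similar size.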
