简体   繁体   中英

finding item in two lists efficiently Python

I have a working function to allow me to search two lists and see if the items in list1 are present in list2, if an item is not present in list2 from list1 then I would like that output to another list. However this takes AGES to run and I was wondering if there is a faster way to do this.

def compare(list1,list2):
    x=[]
    for i in list1:
        if i not in list2:
            x.append(i)

    return x

Since you're only checking for inclusion in list2 , make that one a set:

def compare(list1, list2):
    s2 = set(list2)
    result = [i  for i in list1  if i not in s2] # simple list comprehension
    return result

This preserves the input order and semantics of your program, while being significantly faster: set inclusion testing is O(1), while list inclusion testing is O(n), bringing your whole algorithm from O(n^2) down to O(n).

You can use sets.

Example:

>>> list1 = ['a', 'b', 'c']
>>> list2 = ['c', 'd', 'e']
>>> set(list1) - set(list2)
{'b', 'a'}

Sets by default do not preserve order. If order is important for you, check the recipe for OrderedSet .

Can you use sets?

It would be like:

print list(set(list1)-set(list2))

But it will remove duplicates.

The set1 - set2 answers are correct iff you don't mind losing order and duplicates. But even if you do , using sets can vastly increase the efficiency and readability of the program if list2 is large:

set2 = set(list2)
result = [item for item in list1 if item not in set2]

The cost of generating the set is O(len(list2)) . So the runtime of this approach will be O(len(list2) + len(list1)) and makes a lot of sense unless list2 is very small.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM