
Optimal way to check if the elements of a list are in another list in Python

I need to check if the items in one list are in another list. Both lists contain paths to files.

    list1 = ['a/b/c/file1.txt', 'b/c/d/file2.txt']
    list2 = ['a/b/c/file1.txt', 'b/c/d/file2.txt', 'd/f/g/test4.txt', 'd/k/test5.txt']

I tried something like:

    len1 = len(list1)
    len2 = len(list2)

    res = list(set(list2) - set(list1))
    len3 = len(res)

    if len2 - len1 == len3:
        print("List2 contains all the items in list1")

But this does not seem optimal, and my lists have 50k+ items. I think a hash table could be a good solution, but I don't know exactly how to build one. Any suggestions are welcome.

Python sets are based on hashing, which is why you cannot put unhashable objects in a set. Rather than comparing lengths, perform the set difference directly:

>>> list1 = ['a/b/c/file1.txt', 'b/c/d/file2.txt']
>>> list2 = ['a/b/c/file1.txt', 'b/c/d/file2.txt', 'd/f/g/test4.txt', 'd/k/test5.txt']
>>> if not (set(list1) - set(list2)):  # the difference is an empty set (falsy) when every item is contained
        print("List2 contains all the items in list1")

List2 contains all the items in list1

Here is the breakdown:

>>> difference = set(list1) - set(list2)
>>> difference
set()
>>> bool(difference)
False
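When the check fails, it is often useful to know which paths are missing. The following is a minimal sketch of that idea, not part of the original answer; the helper name `missing_items` and the path `'x/y/missing.txt'` are made up for illustration.

```python
def missing_items(needles, haystack):
    """Return the items of `needles` that do not occur in `haystack`."""
    # Build the hash table once: one pass over `haystack`.
    haystack_set = set(haystack)
    # Keep the original order of `needles`; each lookup is O(1) on average.
    return [item for item in needles if item not in haystack_set]


# 'x/y/missing.txt' is a hypothetical path added to show a failing case.
list1 = ['a/b/c/file1.txt', 'b/c/d/file2.txt', 'x/y/missing.txt']
list2 = ['a/b/c/file1.txt', 'b/c/d/file2.txt', 'd/f/g/test4.txt', 'd/k/test5.txt']

missing = missing_items(list1, list2)
if missing:
    print("Not found in list2:", missing)
else:
    print("List2 contains all the items in list1")
```

Unlike the plain set difference, this keeps the order of list1, which can make the report easier to read.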

"I think a good solution can be by creating a hash table, but I don't know exactly how I could build it."

Sets are already implemented using hash tables, so you are already doing that.

Supposing you don't have (or don't care about) duplicates, you could try:

list1 = [1,2,3]
list2 = [1,2,3,4]
set(list1).issubset(list2)

Note that there is no need to convert list2 to a set, because issubset accepts any iterable; see the comments on this answer.
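If duplicates do matter, so that list2 must contain at least as many copies of each item as list1, a set is not enough. One possible sketch uses `collections.Counter`; the function name `contains_with_counts` is an assumption for illustration and does not come from the original answers.

```python
from collections import Counter


def contains_with_counts(needles, haystack):
    """True if `haystack` holds at least as many copies of every item as `needles`."""
    # Counter subtraction keeps only positive counts, so the result is
    # empty (falsy) exactly when no item of `needles` is left uncovered.
    return not (Counter(needles) - Counter(haystack))


list1 = [1, 1, 2]
list2 = [1, 2, 3]

print(set(list1).issubset(list2))          # ignores the duplicate 1
print(contains_with_counts(list1, list2))  # takes the duplicate 1 into account
```

Both checks still do a linear amount of work on average, since Counter is also backed by a hash table.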

EDIT: both your solution and mine are O(n) on average, and it won't get asymptotically faster than that. Your solution does perform some avoidable work, though, such as converting the difference res into a list just to get its size.
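To see why hashing matters here, the set-based check can be compared with a naive membership test on the list itself. This is a rough sketch under assumptions: the synthetic paths, the list size of 2000, and the function names are invented for the comparison, and the timings will vary by machine.

```python
import timeit


def contains_all_naive(needles, haystack):
    # `in` on a list scans it linearly, so this is
    # O(len(needles) * len(haystack)) in the worst case.
    return all(item in haystack for item in needles)


def contains_all_hashed(needles, haystack):
    # Hash-based check: linear in the combined size of both inputs on average.
    return set(needles).issubset(haystack)


# Synthetic data (assumption): 2000 fake paths, half of them looked up.
haystack = [f"dir{i}/file{i}.txt" for i in range(2000)]
needles = haystack[::2]

for func in (contains_all_naive, contains_all_hashed):
    seconds = timeit.timeit(lambda: func(needles, haystack), number=3)
    print(f"{func.__name__}: {seconds:.4f}s for 3 runs")
```

The gap between the two grows with the list sizes, because the naive version rescans list2 for every item of list1.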
