
How do the set operators (|, &, -, etc.) work so fast?

I am a newbie to Python and coding. I was writing a script to get the intersection (the matching items) of two very long lists, each with over 200,000 items. I used two for loops as follows:

for x in list1:
    for y in list2:
        if x == y:  # keep only the items that appear in both lists
            print(y)

But it took over an hour to get the desired result. I then had the idea of turning the two lists into sets and using the intersection operator & on them. It worked perfectly, and the run time dropped to only three seconds.

My question is: how is that possible? How do these operators work so fast? Don't they also need to iterate through all the items in both sets?

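People in the comments have mentioned that, as a data structure, sets are better than lists for certain kinds of operations, including membership checks. Checking membership of a set is O(1) on average, because sets are implemented as hash tables, so the time it takes does not depend on how many elements the set has. Checking membership of a list is O(n), scaling linearly with the number of elements. Your nested loops therefore perform about 200,000 × 200,000 = 4 × 10^10 comparisons, whereas a set intersection still iterates over one of the sets but needs only a single hash lookup per item, on the order of 200,000 lookups in total. The Big O Cheatsheet is a good reference for choosing data structures.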
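To make the membership argument concrete, here is a minimal pure-Python sketch of a hash-based intersection. The function name intersect_via_hashing and the sample lists are made up for illustration, and this is not CPython's actual implementation; it only shows the idea of replacing the inner loop with a hash lookup.

```python
def intersect_via_hashing(list1, list2):
    """Illustrative stand-in for set intersection; not CPython's real code."""
    lookup = set(list2)      # build the hash table once: O(len(list2))
    matches = []
    for x in list1:          # one pass over the other list: O(len(list1))
        if x in lookup:      # average O(1) hash lookup instead of a full scan
            matches.append(x)
    return matches


list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7]
result = intersect_via_hashing(list1, list2)
print(result)  # [4, 5]
```

The total work grows with len(list1) + len(list2) rather than len(list1) × len(list2), which accounts for most of the difference between an hour and a few seconds.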

Another reason your code may be slow is that your intersection is done with Python for loops, which are interpreted, with every step executed as written*. When you call set.intersection, the work is handled by the interpreter's built-in set implementation, which in CPython is written in C and compiled to machine code. The for loops in this C implementation are therefore much faster than equivalent loops written in Python, and the C compiler has the opportunity to apply additional optimisations that would be impossible with interpreted code.

* I'm not sure if Python's for loop bodies are translated into an optimised intermediate representation, though such an IR would not be comparable to C optimisations, particularly when your loop is not referentially transparent.
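One way to see both effects separately is to time three variants with timeit. This is a rough sketch under assumptions: the list sizes are deliberately small (1,000 items each, not 200,000) so that the nested-loop version finishes quickly, the function names are invented for this example, and the absolute timings will vary by machine.

```python
import timeit

# Small illustrative inputs; the lists in the question were far larger.
list1 = list(range(0, 1000))
list2 = list(range(500, 1500))


def nested_loops():
    # O(n * m): compares every item of list1 with every item of list2.
    return [x for x in list1 for y in list2 if x == y]


def python_loop_with_set():
    # O(n + m), but the loop itself still runs as interpreted Python.
    lookup = set(list2)
    return [x for x in list1 if x in lookup]


def builtin_intersection():
    # O(n + m), with the looping done inside CPython's C implementation.
    return set(list1) & set(list2)


for fn in (nested_loops, python_loop_with_set, builtin_intersection):
    seconds = timeit.timeit(fn, number=3)
    print(f"{fn.__name__}: {seconds:.4f} s for 3 runs")
```

The gap between the first and second variants comes from the algorithm (hash lookups instead of scans); any remaining gap between the second and third comes from running the loop in C rather than in Python.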
