
Understanding the Python GIL - I/O-bound vs CPU-bound

From the Python threading documentation:

In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.

Now I have a thread worker like this:

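from queue import Empty  # Python 3; on Python 2 use: from Queue import Empty

def worker(queue):
    # Drain the queue until no URLs remain.
    while True:
        try:
            url = queue.get(False)  # non-blocking; raises Empty once the queue is drained
        except Empty:
            break
        try:
            w = Wappalyzer(url)  # Wappalyzer is my own class (not shown here)
            w.analyze()
        finally:
            queue.task_done()  # always mark the item done so queue.join() can return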

Here w.analyze() does two things:

  1. Scrapes the URL using the requests library
  2. Analyzes the scraped HTML using the PyV8 JavaScript library

As far as I know, 1 is I/O bound and 2 is CPU bound.

Does that mean the GIL applies to 2 and my program won't work properly?

The GIL description does not say anything about correctness, only about efficiency.

If 2 is CPU bound, you will not be able to get multicore performance out of threading, but your program will still perform correctly.
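To illustrate the correctness point, here is a minimal sketch. The names count_up and run_threads are hypothetical and not from the question, and the loop is only a stand-in for CPU-bound work. Under CPython the threads take turns holding the GIL, so they should not run faster than a sequential loop, but each one still produces the right answer.

```python
import threading


def count_up(n, results, index):
    # Pure-Python loop: CPU-bound, so under CPython the threads
    # take turns holding the GIL rather than running in parallel.
    total = 0
    for i in range(n):
        total += i
    results[index] = total  # each thread writes only to its own slot


def run_threads(inputs):
    # Start one thread per input and wait for all of them to finish.
    results = [None] * len(inputs)
    threads = [
        threading.Thread(target=count_up, args=(n, results, i))
        for i, n in enumerate(inputs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_threads([10, 1000]) gives the same values as computing each sum sequentially; only the wall-clock time fails to improve with more cores.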

If you care about CPU parallelism, you should use Python's multiprocessing library, which runs work in separate processes, each with its own interpreter and its own GIL.
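As a rough sketch of what that could look like, assuming Python 3: cpu_heavy below is a hypothetical stand-in for the CPU-bound analysis step, not the asker's Wappalyzer code, and the pool size of 2 is arbitrary.

```python
import multiprocessing


def cpu_heavy(n):
    # Hypothetical stand-in for the CPU-bound analysis step:
    # the sum of squares of all integers below n.
    return sum(i * i for i in range(n))


def run_in_processes(inputs, processes=2):
    # Each worker process has its own interpreter and its own GIL,
    # so the calls can run on separate cores.
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(cpu_heavy, inputs)  # results keep the input order


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # this module (for example Windows).
    print(run_in_processes([10, 100, 1000]))
```

Arguments and return values are pickled to cross the process boundary, so this pays off mainly when the computation per item outweighs that overhead.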
