Trying to understand Python multithreading

Please consider this code:

import threading

def printer():
    for i in range(2):
        # Hold the lock only while printing one line.
        with lock:
            print(['foo', 'bar', 'baz'])

def main():
    global lock
    lock = threading.Lock()
    threads = [threading.Thread(target=printer) for x in range(2)]
    for t in threads:
        t.start()
        t.join()  # wait for this thread before starting the next one

main()

I understand this code: we create two threads and run them sequentially, starting the second thread only after the first has finished. OK, now consider another variant:

import threading

def printer():
    for i in range(2):
        # Hold the lock only while printing one line.
        with lock:
            print(['foo', 'bar', 'baz'])

def main():
    global lock
    lock = threading.Lock()
    threads = [threading.Thread(target=printer) for x in range(2)]
    for t in threads:
        t.start()  # start every thread first
    for t in threads:
        t.join()   # then wait for all of them

main()

What happens here? OK, we run them in parallel, but what is the purpose of making the main thread wait for the child threads in the second variant? How can it influence the output?

In the second variant, the order of execution is much less well defined. The lock is released on each pass through the loop in printer. In both variants you have two threads, each running two loop iterations. In the first variant only one thread runs at a time, so you know the total ordering. In the second variant, each time the lock is released, the running thread may change. So you might get

  • thread 1 loop 1
  • thread 1 loop 2
  • thread 2 loop 1
  • thread 2 loop 2

or perhaps

  • thread 2 loop 1
  • thread 1 loop 1
  • thread 1 loop 2
  • thread 2 loop 2

The only constraints are that loop 1 within a given thread runs before loop 2 within that thread, and that the output of each print stays together, since the lock is held while printing.
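As a rough illustration of that constraint, here is a minimal sketch that records which thread ran which loop iteration instead of printing. The order list and the thread/loop numbering are my own additions for illustration, not part of the question's code, and the syntax assumes Python 3.

```python
import threading

lock = threading.Lock()
order = []  # hypothetical log of (thread_number, loop_number) pairs

def printer(thread_number):
    for loop_number in (1, 2):
        with lock:
            # The lock is held only for this one append (the "print").
            order.append((thread_number, loop_number))

threads = [threading.Thread(target=printer, args=(n,)) for n in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The interleaving between the two threads may differ from run to run,
# but within each thread loop 1 is always recorded before loop 2.
for n in (1, 2):
    assert order.index((n, 1)) < order.index((n, 2))
print(order)
```

Running this several times may or may not show different interleavings; with work this short, one thread often finishes before the other even starts.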

In this particular case I'm not sure the call to t.join() in the second variant has an observable effect. It guarantees that the main thread will be the last thread to end, but I'm not sure you can observe that in any way in this code. In more complex code, joining the threads can be important so that cleanup actions are performed only after all threads have terminated. This can also be very important if you have daemon threads, because the entire program terminates when all non-daemon threads terminate.
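To make the cleanup point concrete, here is a small sketch, assuming Python 3. The worker function, the results list, and the final "cleanup" step are hypothetical stand-ins; the point is only that the joins ensure every worker, including the daemon one, has finished before the main thread uses the results.

```python
import threading
import time

results = []

def worker(n):
    time.sleep(0.05)  # stand-in for real work
    results.append(n)

threads = [threading.Thread(target=worker, args=(n,)) for n in range(3)]
# A daemon thread does not keep the program alive on its own, so without
# a join its work could be cut off when the main thread exits.
threads.append(threading.Thread(target=worker, args=(99,), daemon=True))

for t in threads:
    t.start()

# Without these joins, the "cleanup" below could run while workers
# are still appending to results.
for t in threads:
    t.join()

summary = sorted(results)  # cleanup step: safe, all workers are done
print(summary)
```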

To better understand multithreading in Python, you first need to understand the relationship between the main thread and its child threads.

The main thread is the entry point of the program; it is created by your system when you run your script. For example, in your script, the main function runs in the main thread.

A child thread, by contrast, is created by your main thread when you instantiate the Thread class.

The most important thing is how the main thread controls a child thread. Basically, the Thread instance is everything the main thread knows about the child thread and its only means of controlling it. When a child thread is created, it does not run immediately; it starts only when the main thread calls start on the Thread instance. After the child thread has started, you can assume that the main thread and the child thread are running in parallel.
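A minimal sketch of that lifecycle, assuming Python 3: creating the Thread object runs nothing, and the target only executes after start is called. The task function and the ran event are hypothetical names used for illustration.

```python
import threading

ran = threading.Event()  # set by the child once its target has executed

def task():
    ran.set()

t = threading.Thread(target=task)

# The Thread object exists, but no thread of execution has started yet.
before_start = (t.ident, t.is_alive(), ran.is_set())

t.start()  # only now does the child begin running task()
t.join()

after_join = (t.ident is not None, t.is_alive(), ran.is_set())
print(before_start)
print(after_join)
```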

Another important point is how the main thread knows that the child thread's task is done. Although the main thread knows nothing about how the child thread carries out its task, it is aware of the child thread's running status. The main thread can call Thread.is_alive to check whether a thread is still running. In practice, Thread.join is usually used to make the main thread wait until the child thread is done. This function blocks the main thread.
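The following sketch, assuming Python 3, shows is_alive and join side by side. The release event is a hypothetical device that keeps the child running until the main thread lets it finish, so the status checks are predictable.

```python
import threading

release = threading.Event()  # hypothetical signal that lets the child finish

def task():
    release.wait()  # the child blocks here until the main thread allows it to end

t = threading.Thread(target=task)
t.start()

alive_while_running = t.is_alive()  # the child is still blocked in task()

t.join(timeout=0.1)                 # gives up after 0.1 s; the child keeps running
alive_after_timeout = t.is_alive()

release.set()                       # let the child finish
t.join()                            # blocks the main thread until the child is done
alive_after_join = t.is_alive()

print(alive_while_running, alive_after_timeout, alive_after_join)
```

Note that join with a timeout returns whether or not the thread has finished, so checking is_alive afterwards is how you tell the two cases apart.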

Okay, let's examine the two scripts you are confused about. For the first script:

for t in threads:
    t.start()
    t.join()

The child threads in the loop are started and then joined one by one. Note that start does not block the main thread, while join blocks the main thread until the child thread is done. Thus they run sequentially.

For the second script:

for t in threads:
    t.start()
for t in threads:
    t.join()

All child threads are started in the first loop. Because Thread.start does not block the main thread, all child threads are running in parallel after the first loop. In the second loop, the main thread waits for each child thread to finish its task, one by one.

Now you should see the difference between these two scripts: in the first, the child threads run one by one, while in the second, they run simultaneously.
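One way to see the difference is to time both patterns. This is only a sketch, assuming Python 3; the task function uses time.sleep as a stand-in for I/O-bound work, and run_sequential and run_concurrent are hypothetical helper names.

```python
import threading
import time

def task():
    time.sleep(0.2)  # stand-in for I/O-bound work

def run_sequential(n):
    start = time.perf_counter()
    for t in [threading.Thread(target=task) for _ in range(n)]:
        t.start()
        t.join()   # wait for this thread before starting the next
    return time.perf_counter() - start

def run_concurrent(n):
    threads = [threading.Thread(target=task) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()  # start all threads first
    for t in threads:
        t.join()   # then wait for all of them
    return time.perf_counter() - start

sequential = run_sequential(3)
concurrent = run_concurrent(3)
print("sequential: %.2fs, concurrent: %.2fs" % (sequential, concurrent))
```

With three threads, the first pattern should take roughly three sleeps in a row, while the second should take roughly one, because the sleeps overlap.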

There are other useful topics in Python threading:

(1) How to handle the KeyboardInterrupt exception, e.g., when you want to terminate the program with Ctrl-C. Only the main thread receives the exception, so you have to handle the termination of the child threads yourself.

(2) Multithreading vs. multiprocessing. Although we say that threads run in parallel, this is not true parallelism at the CPU level: in standard CPython builds, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time. So if your application is CPU-intensive, try multiprocessing; if it is I/O-intensive, multithreading may be sufficient.
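For point (1), one common pattern is to share a threading.Event that the main thread sets when it catches KeyboardInterrupt, so the child threads can exit their loops cleanly. The sketch below assumes Python 3; worker, run_until_interrupted, and fake_ctrl_c are hypothetical names, and the interrupt is simulated by raising the exception rather than by pressing Ctrl-C.

```python
import threading
import time

def worker(stop_event, ticks):
    # Keep working until the main thread asks us to stop.
    while not stop_event.is_set():
        ticks.append(1)
        stop_event.wait(0.01)  # sleep briefly, but wake early if stopped

def run_until_interrupted(wait_for_interrupt):
    stop_event = threading.Event()
    ticks = []
    t = threading.Thread(target=worker, args=(stop_event, ticks))
    t.start()
    try:
        wait_for_interrupt()  # whatever the main thread does while waiting
    except KeyboardInterrupt:
        pass                  # only the main thread sees this exception
    finally:
        stop_event.set()      # tell the child to finish
        t.join()              # and wait until it has
    return t.is_alive(), len(ticks)

def fake_ctrl_c():
    time.sleep(0.05)
    raise KeyboardInterrupt   # simulate the user pressing Ctrl-C

still_alive, tick_count = run_until_interrupted(fake_ctrl_c)
print(still_alive, tick_count)
```

The child checks the event on every iteration, so it stops shortly after the main thread sets it instead of being killed mid-task.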

By the way, reading through the threading section of the Python documentation and trying out some code may help you understand it.

Hope this helps. Thanks.
