Python multithreading: wait for all threads to finish

Question

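This may have been asked in a similar context, but I couldn't find an answer after about 20 minutes of searching, so I'll ask.

I have written a Python script (say, scriptA.py) and another script (say, scriptB.py).

In scriptB I want to call scriptA multiple times with different arguments. Each run takes about an hour (it's a huge script that does lots of things, don't worry about it), and I want to be able to run scriptA with all the different arguments simultaneously, but I need to wait until all of them are done before continuing. My code: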

import subprocess

#setup
do_setup()

#run scriptA
subprocess.call(scriptA + argumentsA)
subprocess.call(scriptA + argumentsB)
subprocess.call(scriptA + argumentsC)

#finish
do_finish()

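I want to run all the subprocess.call() calls at the same time and then wait until they are all done. How should I do this?

I tried to use threading, like the example here: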

from threading import Thread
import subprocess

def call_script(args):
    subprocess.call(args)

#run scriptA
t1 = Thread(target=call_script, args=(scriptA + argumentsA))
t2 = Thread(target=call_script, args=(scriptA + argumentsB))
t3 = Thread(target=call_script, args=(scriptA + argumentsC))
t1.start()
t2.start()
t3.start()

But I don't think this is right.

How do I know they have all finished running before I go on to my do_finish()?

Answer 1

Put the threads in a list, then use the join method:

 threads = []

 t = Thread(...)
 threads.append(t)

 ...repeat as often as necessary...

 # Start all threads
 for x in threads:
     x.start()

 # Wait for all of them to finish
 for x in threads:
     x.join()
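
As a minimal sketch of the list-and-join pattern applied to the question, the following starts one thread per child process and blocks until all of them have exited. The commands built from `sys.executable -c ...` and the `run_command` helper are assumptions standing in for scriptA and its arguments; they are not part of the original question.

```python
import subprocess
import sys
from threading import Thread

# Hypothetical stand-ins for "scriptA + argumentsX": each command starts a
# Python child process that simply exits with a chosen return code.
commands = [
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "import sys; sys.exit(0)"],
]

# One slot per command, so each thread writes only to its own index.
return_codes = [None] * len(commands)

def run_command(index, command):
    # subprocess.call blocks this worker thread until the child exits.
    return_codes[index] = subprocess.call(command)

threads = [Thread(target=run_command, args=(i, cmd))
           for i, cmd in enumerate(commands)]

# Start all threads
for t in threads:
    t.start()

# Wait for all of them to finish
for t in threads:
    t.join()

print(return_codes)  # [0, 3, 0]
```

Storing the return codes by index keeps the results in input order regardless of which child finishes first.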

Answer 2

You need to use the join method of the Thread object at the end of the script.

# args must be a tuple; the trailing comma passes the command as one argument
t1 = Thread(target=call_script, args=(scriptA + argumentsA,))
t2 = Thread(target=call_script, args=(scriptA + argumentsB,))
t3 = Thread(target=call_script, args=(scriptA + argumentsC,))

t1.start()
t2.start()
t3.start()

t1.join()
t2.join()
t3.join()

Thus the main thread will wait until t1, t2 and t3 finish executing.
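
Since each thread in the question only waits on a child process, one variation is to skip the threads and start the processes directly with `subprocess.Popen`, which returns without waiting, and then call `wait()` on each. This is a different technique from `Thread.join`, sketched here under the assumption that only the exit codes are needed. The `sys.executable -c ...` commands are placeholders, not the asker's scriptA.

```python
import subprocess
import sys

# Hypothetical placeholder commands; replace with the real script and arguments.
commands = [
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    [sys.executable, "-c", "import sys; sys.exit(2)"],
]

# Popen starts each child and returns immediately, so all three run at once.
procs = [subprocess.Popen(cmd) for cmd in commands]

# wait() blocks until that child has exited and returns its exit code.
exit_codes = [p.wait() for p in procs]

print(exit_codes)  # [0, 0, 2]
```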

Answer 3

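In Python 3, since Python 3.2, there is a new approach to reach the same result, which I personally prefer to the traditional thread creation/start/join: the concurrent.futures package: https://docs.python.org/3/library/concurrent.futures.html

Using a ThreadPoolExecutor the code would be: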

from concurrent.futures import ThreadPoolExecutor
import time

def call_script(ordinal, arg):
    print('Thread', ordinal, 'argument:', arg)
    time.sleep(2)
    print('Thread', ordinal, 'Finished')

args = ['argumentsA', 'argumentsB', 'argumentsC']

with ThreadPoolExecutor(max_workers=2) as executor:
    ordinal = 1
    for arg in args:
        executor.submit(call_script, ordinal, arg)
        ordinal += 1
print('All tasks has been finished')

The output of the previous code is something like:

Thread 1 argument: argumentsA
Thread 2 argument: argumentsB
Thread 1 Finished
Thread 2 Finished
Thread 3 argument: argumentsC
Thread 3 Finished
All tasks has been finished

One of the advantages is that you can control the throughput by setting the maximum number of concurrent workers.
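
The `submit` calls in that example discard their return values. Here is a hedged sketch of how the returned `Future` objects could be kept so that results, and any exceptions raised in the workers, reach the main thread. The body of `call_script` is a placeholder assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def call_script(arg):
    # Placeholder for the real work (for example a subprocess call).
    time.sleep(0.1)
    return arg.upper()

args = ['argumentsA', 'argumentsB', 'argumentsC']
results = {}

with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() returns a Future immediately; remember which input it belongs to.
    futures = {executor.submit(call_script, arg): arg for arg in args}
    # as_completed() yields each Future as soon as its task has finished.
    for future in as_completed(futures):
        # result() returns the worker's return value, or re-raises its exception.
        results[futures[future]] = future.result()

# Leaving the with block waits for any remaining tasks before continuing.
print(len(results), 'tasks finished')
```

Keying the results by input argument avoids depending on the order in which the tasks happen to complete.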

Answer 4

I prefer using list comprehensions based on an input list:

inputs = [scriptA + argumentsA, scriptA + argumentsB, ...]
threads = [Thread(target=call_script, args=(i,)) for i in inputs]  # (i,) is a one-element tuple
[t.start() for t in threads]
[t.join() for t in threads]

Answer 5

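You can have a class like the one below, to which you can add any number of functions or console_scripts that you want to execute in parallel, then start the execution and wait for all jobs to complete.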

from multiprocessing import Process

class ProcessParallel(object):
    """
    Run the given functions in parallel, one process per function.
    """
    def __init__(self, *jobs):
        """
        Store the callables to run.
        """
        self.jobs = jobs
        self.processes = []

    def fork_processes(self):
        """
        Create a Process object for each given function.
        """
        for job in self.jobs:
            proc = Process(target=job)
            self.processes.append(proc)

    def start_all(self):
        """
        Start all the processes.
        """
        for proc in self.processes:
            proc.start()

    def join_all(self):
        """
        Wait until all the processes have finished.
        """
        for proc in self.processes:
            proc.join()


def two_sum(a=2, b=2):
    return a + b

def multiply(a=2, b=2):
    return a * b


# How to run:
if __name__ == '__main__':
    # Note: two_sum and multiply can be replaced with any Python console
    # scripts that you want to run in parallel.
    procs = ProcessParallel(two_sum, multiply)
    # Create a process for each job
    procs.fork_processes()
    # Start executing the processes
    procs.start_all()
    # Wait until all the processes have finished
    procs.join_all()

Answer 6

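From the threading module documentation:

There is a “main thread” object; this corresponds to the initial thread of control in the Python program. It is not a daemon thread.

There is the possibility that “dummy thread objects” are created. These are thread objects corresponding to “alien threads”, which are threads of control started outside the threading module, such as directly from C code. Dummy thread objects have limited functionality; they are always considered alive and daemonic, and cannot be join()ed. They are never deleted, since it is impossible to detect the termination of alien threads.

So, to catch those two cases when you are not interested in keeping a list of the threads you create: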

import threading as thrd


def alter_data(data, index):
    data[index] *= 2


data = [0, 2, 6, 20]

for i, value in enumerate(data):
    thrd.Thread(target=alter_data, args=[data, i]).start()

for thread in thrd.enumerate():
    if thread.daemon:
        continue
    try:
        thread.join()
    except RuntimeError as err:
        if 'cannot join current thread' in err.args[0]:
            # catches the main thread
            continue
        else:
            raise

After which:

>>> print(data)
[0, 4, 12, 40]
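
A variation on the same idea, sketched under the assumption that only threads started after a known point should be joined: instead of catching the RuntimeError, it compares each thread against `threading.current_thread()` and against a snapshot taken before the workers were started. The helper name `join_new_threads` and the snapshot are illustrative additions, not part of the threading module.

```python
import threading

def join_new_threads(existing):
    """Join every live non-daemon thread that is not in `existing`."""
    current = threading.current_thread()
    for thread in threading.enumerate():
        # Skip daemon threads, the calling thread (which cannot join
        # itself), and any thread that was already running beforehand.
        if thread.daemon or thread is current or thread in existing:
            continue
        thread.join()

def alter_data(data, index):
    data[index] *= 2

data = [0, 2, 6, 20]

# Snapshot of the threads that exist before any worker is started.
already_running = set(threading.enumerate())

for i in range(len(data)):
    threading.Thread(target=alter_data, args=(data, i)).start()

join_new_threads(already_running)
print(data)  # [0, 4, 12, 40]
```

The snapshot keeps the helper from blocking on unrelated threads, for example ones owned by a framework that never exit.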

Answer 7

Maybe something like:

for t in threading.enumerate():
    if t.daemon:
        t.join()

Answer 8

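I just came across the same problem: I needed to wait for all the threads that had been created in a for loop. I tried out the following piece of code. It may not be the perfect solution, but I thought it would be a simple one to test: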

import threading

for t in threading.enumerate():
    try:
        t.join()
    except RuntimeError as err:
        # str(err) is needed: an exception object does not support "in"
        if 'cannot join current thread' in str(err):
            continue
        else:
            raise

Answer 9

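Using only join can result in a false-positive interaction with the thread. As the docs say:

When the timeout argument is present and not None, it should be a floating point number specifying a timeout for the operation in seconds (or fractions thereof). As join() always returns None, you must call is_alive() after join() to decide whether a timeout happened; if the thread is still alive, the join() call timed out.

And an illustrative piece of code: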

threads = []
for name in some_data:
    new = threading.Thread(
        target=self.some_func,
        args=(name,)
    )
    threads.append(new)
    new.start()
    
over_threads = iter(threads)
curr_th = next(over_threads)
while True:
    curr_th.join()
    if curr_th.is_alive():
        continue
    try:
        curr_th = next(over_threads)
    except StopIteration:
        break
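
The quoted rule only comes into play when a timeout is passed to join(). A small sketch of that case follows, with a hypothetical `blocked` worker held open by a `threading.Event` so that the outcome does not depend on timing:

```python
import threading

release = threading.Event()

def quick():
    pass  # returns immediately

def blocked():
    release.wait()  # stays alive until the main thread sets the event

fast = threading.Thread(target=quick)
slow = threading.Thread(target=blocked)
fast.start()
slow.start()

# join() returns None in both cases, so is_alive() tells them apart.
fast.join(timeout=5.0)
fast_timed_out = fast.is_alive()   # False: the thread finished in time

slow.join(timeout=0.1)
slow_timed_out = slow.is_alive()   # True: the join() call timed out

# Let the blocked worker finish, then wait for it without a timeout.
release.set()
slow.join()

print(fast_timed_out, slow_timed_out, slow.is_alive())  # False True False
```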

9 answers (score, date posted)

Answer 1: 195, 2012-08-15 12:00:03
Answer 2: 179, accepted, 2012-08-15 11:54:27
Answer 3: 37, 2016-05-20 08:02:14
Answer 4: 32, 2015-08-05 11:19:32
Answer 5: 6, 2013-04-30 15:07:45
Answer 6: 3, 2018-07-10 09:55:24
Answer 7: 2, 2017-06-06 12:31:22
Answer 8: 2, 2018-03-14 13:35:32
Answer 9: 0, 2022-05-10 01:20:07