Python multithreading function argument

I was writing some multithreading code, had a syntax issue in it, and found that the code was executing sequentially rather than in parallel. I fixed it by passing the argument to submit() separately instead of passing it as a parameter to the function, but I couldn't figure out why Python was behaving that way and couldn't find documentation for it. Anyone know why?

import time
from concurrent.futures import ThreadPoolExecutor

def do_work(i):
    print("{} {} - Command started".format(i, time.time()))
    time.sleep(1)

count = 0
executor = ThreadPoolExecutor(max_workers=2)
while count < 5:
    print("Starting work")
    executor.submit(do_work(count))
    print("Work submitted")
    count += 1

I changed this line to make it run in parallel:

    executor.submit(do_work, count)

You were telling Python to execute the function do_work(), and then to pass whatever that function returned to executor.submit():

executor.submit(do_work(count))

It might be easier for you to see this if you used a variable to hold the result of do_work(). The following is functionally equivalent to the above:

do_work_result = do_work(count)
executor.submit(do_work_result)

In Python, functions are first-class objects; using just the name do_work you are referencing the function object. Only adding (...) to an expression that produces a function object (or another callable object type) causes something to be executed.
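A minimal sketch of that distinction, using a hypothetical greet function that is not part of the question's code:

```python
def greet(name):
    """Hypothetical example function; returns a greeting string."""
    return "Hello, {}".format(name)

# Referencing the bare name gives you the function object; nothing runs yet.
func = greet
print(callable(func))

# Adding (...) is what actually executes the function.
result = func("world")
print(result)

# The result is a plain string, which is not callable.
print(callable(result))
```

Passing func somewhere hands over something that can still be called later; passing result hands over only the finished value.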

In the form

executor.submit(do_work, count)

you do not call the function. You are passing in the function object itself as the first argument, and count as the second argument. The executor.submit() function accepts a callable object and its arguments, and runs that callable later on, in a worker thread, with the arguments provided.

This allows the ThreadPoolExecutor to take that function reference and the single argument and only call the function in a new thread, later on.
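As a rough illustration, here is the question's script reworked to use that form. The shorter sleep, the return value, and the timing variables are my additions for demonstration, not part of the original code:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def do_work(i):
    print("{} {} - Command started".format(i, time.time()))
    time.sleep(0.2)  # shortened from the question's 1 second
    return i

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as executor:
    # Pass the function object and its argument separately; no call happens here.
    futures = [executor.submit(do_work, count) for count in range(4)]
    # submit() returns right away, well before the work is finished.
    submit_time = time.perf_counter() - start
    # result() blocks until the corresponding call has completed.
    results = [f.result() for f in futures]
elapsed = time.perf_counter() - start

print("submitting took {:.3f}s, all work took {:.3f}s".format(submit_time, elapsed))
```

With two workers and four 0.2-second tasks, the total should land well under the 0.8 seconds that four sequential calls would need.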

Because you were calling the function first, each call had to complete before anything was handed to the executor, so the calls ran sequentially in the main thread. And because the function returns None, you were passing those None references to executor.submit(). Each of those submissions then fails later on with a TypeError exception telling you that 'NoneType' object is not callable; the executor records that exception on the Future that submit() returns, so you would see it when asking that future for its result. That happens because the thread pool executor tried to use None(), which doesn't work because, indeed, None is not a callable.
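A small sketch of what happens to that None. The do_work here is a stripped-down stand-in for the question's function, and the exception is read back through the Future object that submit() returns:

```python
from concurrent.futures import ThreadPoolExecutor

def do_work(i):
    # Like the question's function, this implicitly returns None.
    pass

with ThreadPoolExecutor(max_workers=2) as executor:
    # do_work(0) runs right here, in the main thread; submit() receives None.
    future = executor.submit(do_work(0))
    # The failed call is recorded on the future rather than raised here.
    error = future.exception()

print(type(error).__name__, error)
```

Since the question's loop never looks at the returned futures, the error would go unnoticed there.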

Under the hood, the library essentially does this:

def submit(self, fn, *args, **kwargs):
    # record the function to be called as a work item, with other information
    w = _WorkItem(..., fn, args, kwargs)
    self._work_queue.put(w)

so a work item referencing the function and arguments is added to a queue. Worker threads are created which take items from the queue again. When an item is taken from the queue (in another thread, or a child process), the _WorkItem.run() method is called, which runs your function:

result = self.fn(*self.args, **self.kwargs)

Only then is the (...) call syntax used. Because there are multiple threads, the code is executed concurrently.
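The same queue-and-worker idea can be sketched as a toy class. TinyExecutor, record, and seen are hypothetical names made up for this illustration; the real library also handles futures, results, and exceptions, which this sketch ignores:

```python
import queue
import threading

class TinyExecutor:
    """Toy illustration only; not how concurrent.futures is written."""

    def __init__(self, max_workers=2):
        self._work_queue = queue.Queue()
        self._threads = [
            threading.Thread(target=self._worker) for _ in range(max_workers)
        ]
        for t in self._threads:
            t.start()

    def submit(self, fn, *args, **kwargs):
        # Only record what to call; fn is not called here.
        self._work_queue.put((fn, args, kwargs))

    def _worker(self):
        while True:
            item = self._work_queue.get()
            if item is None:  # sentinel: no more work for this thread
                break
            fn, args, kwargs = item
            # The (...) call happens here, in the worker thread.
            fn(*args, **kwargs)

    def shutdown(self):
        # One sentinel per worker, then wait for all of them to finish.
        for _ in self._threads:
            self._work_queue.put(None)
        for t in self._threads:
            t.join()

seen = []

def record(i):
    # Store the argument and whether we are running outside the main thread.
    in_worker = threading.current_thread() is not threading.main_thread()
    seen.append((i, in_worker))

pool = TinyExecutor(max_workers=2)
for n in range(5):
    pool.submit(record, n)
pool.shutdown()
print(sorted(seen))
```

Calling pool.submit(record(n)) instead would run record in the main thread and put None on the queue, which is the mistake from the question.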

You do want to read up on how pure Python code can't run in parallel, only concurrently; see "Does Python support multithreading? Can it speed up execution time?"

Your do_work() functions only run 'faster' because time.sleep() doesn't have to do any actual work, apart from telling the kernel not to give any execution time to the thread the sleep was executed on for the requested amount of time. You end up with a bunch of threads all asleep. If your workers had to execute Python instructions, then the total time spent on running these functions concurrently or sequentially would not differ all that much.
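One way to see the difference for yourself is to time a sleeping task against a pure-Python loop. The helper names (sleepy, busy, timed) and the workload sizes are arbitrary choices for this sketch, and the busy timings assume a standard CPython build with the GIL; other builds or interpreters may behave differently:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def sleepy(_):
    time.sleep(0.2)  # the thread is idle, so other threads can run

def busy(_):
    # Pure-Python arithmetic that keeps the interpreter busy.
    total = 0
    for n in range(200_000):
        total += n * n
    return total

def timed(fn, jobs, workers):
    """Return the seconds taken to run fn on `jobs` inputs using `workers` threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        list(executor.map(fn, range(jobs)))
    return time.perf_counter() - start

sleep_one = timed(sleepy, 4, 1)
sleep_four = timed(sleepy, 4, 4)
busy_one = timed(busy, 4, 1)
busy_four = timed(busy, 4, 4)

print("sleep: 1 worker {:.2f}s, 4 workers {:.2f}s".format(sleep_one, sleep_four))
print("busy:  1 worker {:.2f}s, 4 workers {:.2f}s".format(busy_one, busy_four))
```

The sleeping tasks should finish in roughly a quarter of the time with four workers, while the two busy timings are expected to stay close to each other under that assumption.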
