Does using the subprocess module release the Python GIL?
When calling a Linux binary that takes a relatively long time through Python's subprocess module, does this release the GIL?
I want to parallelise some code which calls a binary program from the command line. Is it better to use threads (through threading and a multiprocessing.pool.ThreadPool) or multiprocessing? My assumption is that if subprocess releases the GIL, then the threading option is the better choice.
When calling a linux binary which takes a relatively long time through Python's subprocess module, does this release the GIL?
Yes, it releases the Global Interpreter Lock (GIL) in the calling process.
As you are likely aware, on POSIX platforms subprocess offers convenience interfaces atop the "raw" components fork, execve, and waitpid.
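To make the three "raw" components concrete, here is a minimal POSIX-only sketch of the fork / exec / waitpid sequence using the os module. It is an illustration, not how subprocess is actually implemented (the real module also handles pipes, error reporting from the child, file descriptor cleanup, and more). The helper name spawn_and_wait and the trivial child command are assumptions made for the example.

```python
import os
import sys


def spawn_and_wait(argv):
    """Hypothetical helper: run argv in a child process and return the raw wait status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image. execv only returns (by raising) on failure.
        try:
            os.execv(argv[0], argv)
        finally:
            # Reached only if execv failed; exit without running parent cleanup code.
            os._exit(127)
    # Parent: this is the blocking step, the one during which CPython drops the GIL.
    _, status = os.waitpid(pid, 0)
    return status


# Assumption: a child Python interpreter that does nothing stands in for the binary.
status = spawn_and_wait([sys.executable, "-c", "pass"])
print("exited normally:", os.WIFEXITED(status))
```

Only the waitpid call can block for a long time, which is why it is the one that matters for the GIL question.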
By inspection of the CPython 2.7.9 sources, fork and execve do not release the GIL. However, those calls do not block, so we would not expect the GIL to be released.
waitpid of course does block, but we see that its implementation does give up the GIL using the ALLOW_THREADS macros:
static PyObject *
posix_waitpid(PyObject *self, PyObject *args)
{
    ....
    Py_BEGIN_ALLOW_THREADS
    pid = waitpid(pid, &status, options);
    Py_END_ALLOW_THREADS
    ....
This could also be tested by calling out to some long-running program like sleep from a demonstration multithreaded Python script.
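A minimal sketch of such a test follows. It assumes a child Python interpreter that sleeps for one second as a portable stand-in for the sleep binary; the names CMD, run_child and timed_run are made up for the example. If the waiting threads held the GIL, four children would take at least four seconds in total; if the GIL is released while waiting, the total should be close to one second.

```python
import subprocess
import sys
import threading
import time

# Assumption: a sleeping child interpreter stands in for the long-running binary.
CMD = [sys.executable, "-c", "import time; time.sleep(1)"]


def run_child():
    # subprocess.call blocks the calling thread until the child exits.
    subprocess.call(CMD)


def timed_run(n_threads):
    """Start n_threads threads that each wait on one child; return the wall-clock time."""
    threads = [threading.Thread(target=run_child) for _ in range(n_threads)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


if __name__ == "__main__":
    print("4 children took %.2f s" % timed_run(4))
```

A result well under four seconds indicates that the threads were waiting concurrently rather than one after another.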
The GIL doesn't span multiple processes.
subprocess.Popen starts a new process. If it starts a Python process, then that process will have its own GIL.
You don't need multiple threads (or processes created by multiprocessing) if all you want is to run some Linux binaries in parallel:
from subprocess import Popen

# start all processes
processes = [Popen(['program', str(i)]) for i in range(10)]
# now all processes run in parallel

# wait for processes to complete
for p in processes:
    p.wait()
You could use multiprocessing.pool.ThreadPool to limit the number of concurrently running programs.
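One way this could look is sketched below. The command is a placeholder (a child Python interpreter that exits immediately, with the job index passed as an unused argument), and the function names run and run_all are assumptions for the example. Each worker thread blocks in subprocess.call, so at most max_concurrent children run at any moment.

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def run(i):
    # Placeholder command: substitute the real binary and its arguments here.
    cmd = [sys.executable, "-c", "import sys; sys.exit(0)", str(i)]
    # Blocks this worker thread (not the whole interpreter) until the child exits.
    return i, subprocess.call(cmd)


def run_all(n_jobs, max_concurrent):
    """Run n_jobs children with at most max_concurrent of them alive at once."""
    pool = ThreadPool(max_concurrent)
    try:
        # map returns results in input order: a list of (index, return code) pairs.
        return pool.map(run, range(n_jobs))
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    for i, returncode in run_all(10, 4):
        print(i, returncode)
```

Threads are sufficient here because the work happens in the child processes; the Python threads only wait.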
Since subprocess is for running executables (it is essentially a wrapper around os.fork() and os.execve()), it probably makes more sense to use it. You can use subprocess.Popen. Something like:
import subprocess
process = subprocess.Popen(["binary"])
This will run as a separate process, and is therefore not affected by the GIL. You can then use the Popen.poll() method to check whether the child process has terminated:
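# poll() returns None while the child is still running, and its return
# code (which is 0 on success) once it has exited
if process.poll() is not None:
    # process has finished its work
    returncode = process.returncode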
Just make sure you don't call any of the methods that wait for the process to finish its work (e.g. Popen.communicate()), so that your Python script doesn't block.
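As a rough sketch of the non-blocking pattern, the loop below keeps checking Popen.poll() and is free to do other work between checks. The helper name wait_without_blocking, the polling interval, and the sleeping child command are assumptions for the example. Note that poll() returns None while the child is running, so the check compares against None rather than relying on truthiness (a successful exit code of 0 is falsy).

```python
import subprocess
import sys
import time


def wait_without_blocking(process, interval=0.05):
    """Poll until the child exits; return (return code, number of polls while running)."""
    ticks = 0
    while process.poll() is None:
        # The child is still running; other useful work could go here instead.
        ticks += 1
        time.sleep(interval)
    return process.returncode, ticks


# Assumption: a short-lived child interpreter stands in for the real binary.
process = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.3)"])
returncode, ticks = wait_without_blocking(process)
print("return code:", returncode)
```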
As mentioned in this answer:
multiprocessing is for running functions within your existing (Python) code, with support for more flexible communications among the family of processes. The multiprocessing module is intended to provide interfaces and features which are very similar to threading while allowing CPython to scale your processing among multiple CPUs/cores despite the GIL.
So, given your use-case, subprocess seems to be the right choice.