Python Multiprocessing: Only one process is running

I am trying to spawn multiple parallel processes using the Python multiprocessing module. Basically, I did something like:

pool = Pool(30)
results = [pool.apply_async(foo, (trainData, featureVector, terms, selLabel)) for selLabel in selLabels]
for r in results:
    tmp = r.get()
    modelFiles[tmp[0]] = tmp[1]

30 processes are spawned; however, most of them seem to have been put to sleep while only one process is actually running. Below is what I get from ps:

PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
31267 74.6  2.4 7125412 6360080 pts/1 Sl+  13:06  24:25  \_ python2.6 /home/PerlModules/Python/DoOVA.py
31427 27.4  2.3 6528532 6120904 pts/1 R+   13:20   5:18      \_ python2.6 /home/PerlModules/Python/DoOVA.py
31428  0.0  1.3 4024724 3617016 pts/1 S+   13:20   0:00      \_ python2.6 /home/PerlModules/Python/DoOVA.py
31429  0.0  1.3 4024724 3617016 pts/1 S+   13:20   0:00      \_ python2.6 /home/PerlModules/Python/DoOVA.py
31430  0.0  1.3 4024724 3617016 pts/1 S+   13:20   0:00      \_ python2.6 /home/PerlModules/Python/DoOVA.py

DoOVA.py is the script I am running. Most of the processes have the status S+.

Could anyone give me a clue about what the problem is? I know the input argument featureVector is pretty large, around 300MB. Would that be a problem? The machine I am running on has several TB of memory, though.

foo does something like:

def foo(trainData, featureVector, terms, selLabel, penalty):
    outputFile = 'train_'+selLabel+'.dat'
    annotation = dict()
    for id in trainData:
        if trainData[id] == selLabel:
            annotation[id] = '1'
        else:
            annotation[id] = '-1'
    try:
        os.mkdir(selLabel)
        os.chdir(selLabel)
    except OSError:
        os.chdir(selLabel)
    ###Some more functions, which involves a command line call through call from subprocess module
    os.chdir('../')
    return (selLabel, 'SVM_' + selLabel + '.model')

All other input arguments are small, and the machine has at least 100 CPUs. In every run, the script takes a long time before any directory is even created, although no significant computation happens in foo before os.mkdir().

As the comments indicate, you want to pass featureVector using the initializer and initargs arguments to Pool. On Unix-type systems this will result in a massive performance increase (even if there's only 1 item in selLabels) because the value will be passed to the child process essentially for free using os.fork. Otherwise, each time foo is called, featureVector will get pickled by the parent process, passed through a pipe, and unpickled by the child process. This will take a long time, and will essentially serialize all the child processes, since they'll be waiting for the parent process to pickle and send a copy of featureVector for each call, one by one.

Since there's some confusion about what I'm talking about above, here's a somewhat longer explanation of what's happening in your code as it's currently written:

When you create the Pool object, 30 worker processes are created immediately, all children of the main process which created the Pool object. In order to communicate with each child process, a pipe is created. This pipe allows two-way communication between the parent process and the child processes. The parent uses the pipe to instruct the child process what to do, and the children use the pipe to notify the parent of the result of any operations.

When you first call pool.apply_async, the parent process sends a command through the pipe instructing a child process to execute the foo function using the arguments supplied. Since one of the arguments is huge, 300MB, this ends up taking a very long time. The parent process has to pickle the object. This converts the object (and everything it references) into a byte stream that can be sent through a pipe.
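To make the pickling step concrete, here is a minimal sketch. The names feature_vector and sel_label are hypothetical stand-ins for the question's featureVector and selLabel, and the list is far smaller than 300MB so the example runs quickly.

```python
import pickle

# Hypothetical stand-ins: a large list plays the role of featureVector,
# a short string plays the role of selLabel.
feature_vector = list(range(200_000))
sel_label = "label_a"


def pickled_size(obj):
    """Return the number of bytes pickle produces for obj."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


# The argument tuple is part of what has to be serialized for each call.
big_call = pickled_size((feature_vector, sel_label))
small_call = pickled_size((sel_label,))

# Unpickling rebuilds an equal but separate copy, as a child process would.
copy = pickle.loads(pickle.dumps(feature_vector))

print(big_call, small_call)
```

The byte count for the call that carries the large object is paid again on every apply_async call, while the label-only call stays tiny.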

Since the pipe can only hold about 64k (Linux default), and you're sending much more than that, this effectively synchronizes the parent and one of the child processes. The parent process can only send the arguments as fast as the child process can receive and unpickle them, and the child process can only receive the arguments as fast as the parent can pickle and send them. While this is going on, all the other child processes have to wait. The parent process can only send a command to one child process at a time.
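If you want to check the pipe buffer size on your own machine, the following sketch fills an OS pipe until a write would block. It assumes a Unix-like system where os.set_blocking works on pipe descriptors; the helper name pipe_capacity is made up for this illustration, and the number it reports depends on the platform.

```python
import os


def pipe_capacity():
    """Count how many bytes fit in an OS pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode makes a full pipe raise instead of hanging.
        os.set_blocking(write_fd, False)
        total = 0
        chunk = b"x" * 1024
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


capacity = pipe_capacity()
print("pipe buffer holds about", capacity, "bytes")
```

Whatever the exact figure is, it is tiny compared to a 300MB argument, so the sender spends most of its time waiting for the receiver to drain the pipe.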

Once the parent process has finished sending all the arguments for the first call of foo, it can then move on to sending a command to call foo for a second time. Very soon after that, once the child process has finished receiving all the arguments, the child will call foo. (This is why it takes a long time before any directory is created: it takes a long time before foo is even called.) After foo returns, the child process will then wait for the parent process to send another command. If foo itself takes a short enough amount of time to execute, it's possible that the same child process that received the first command to call foo will also receive the second command to call foo.

Unless foo itself takes a long time to execute, as long as or longer than it takes to send featureVector over the pipe, you'll be effectively limited to just one child process executing at a time. The parent process will be trying to command the child processes to call foo as fast as it can, but because featureVector is so big, it can only do so at a very slow rate. By the time it's done sending the command to one process to call foo, the previous process it commanded to call foo will have already finished calling foo long ago. There will be little or no overlap between running child processes.
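As a rough back-of-the-envelope model of this argument (my own simplification, not something measured from your script): if the parent needs send_seconds to ship one task and foo needs run_seconds to execute, then about run_seconds / send_seconds calls overlap at any moment, capped by the pool size. The function name and the timings below are hypothetical.

```python
def estimated_busy_workers(send_seconds, run_seconds, pool_size):
    """Rough steady-state count of workers executing foo at once.

    The parent hands out one task every send_seconds, and each task
    occupies a worker for run_seconds, so about run_seconds / send_seconds
    tasks overlap, capped by the pool size.
    """
    if send_seconds <= 0:
        return pool_size
    return min(pool_size, run_seconds / send_seconds)


# Hypothetical timings: a slow 10 s transfer versus a 5 s foo.
slow_transfer = estimated_busy_workers(10, 5, 30)

# Hypothetical timings: a near-free 0.01 s transfer versus the same foo.
fast_transfer = estimated_busy_workers(0.01, 5, 30)

print(slow_transfer, fast_transfer)
```

With the slow transfer the estimate stays below one busy worker, which matches the single running process in the ps listing; once the transfer is cheap, the pool size becomes the limit.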

In order to fix the performance problem in your code, you'll want to do something like this:

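from multiprocessing import Pool

def child_initialize(_trainData, _featureVector, _terms):
    # Runs once in each worker process and stores the inputs as module globals.
    global trainData, featureVector, terms
    trainData = _trainData
    featureVector = _featureVector
    terms = _terms

def foo(selLabel):
    # Reads trainData, featureVector and terms from the globals set above.
    ...

# Pool calls child_initialize(*initargs) once in every worker it starts.
pool = Pool(30, initializer=child_initialize, initargs=(trainData, featureVector, terms))
results = [pool.apply_async(foo, (selLabel,)) for selLabel in selLabels]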

This code also passes trainData and terms using initargs, on the assumption that they don't change either.

While this should result in a huge performance improvement, and allow the child processes to run in parallel, it's unlikely to mean that the child processes will show up in ps in the runnable state much more often. Your example foo function looks like it'll be spending most of its time waiting for the "command line call" to finish.
