Why do I have to use .wait() with Python's subprocess module?

Question

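I am running a Perl script through Python's subprocess module on Linux. The function that runs the script is called several times with a variable input.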

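import subprocess

def script_runner(variable_input):
    # One stdout file and one stderr file per input value.
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    # Popen returns immediately; the Perl script keeps running in the background.
    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                               stdout=out_file, stderr=error_file)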

However, if I call this function, say, twice, the first process stops executing when the second one starts. I can get the behaviour I want by adding

process.wait()

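after launching the script, so I am not really stuck. However, I would like to find out why I cannot launch the script several times with subprocess and have the runs proceed in parallel, without having to wait for each one to finish before starting the next.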

UPDATE

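The culprit turned out not to be that exciting: the Perl script used a common file that was rewritten on every execution.

However, the lesson I learned from this is that the garbage collector does not kill the process once it has started running, because garbage collection had no effect on my script once I had sorted the shared file out.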

Answer 1

If you are on Unix and want to run many processes in the background, you can use subprocess.Popen like this:

x_fork_many.py:

import subprocess
import os
import sys
import time
import random
import gc  # Only here to test the hypothesis that garbage-collecting p = Popen() causes the problem.

# Spawns N (3) children in quick succession,
# then reports as each child finishes.
if __name__ == '__main__':
    N = 3
    if len(sys.argv) > 1:
        # Child mode: sleep for a random time, then exit.
        x = random.randint(1, 10)
        print('{p} sleeping for {x} sec'.format(p=os.getpid(), x=x))
        time.sleep(x)
    else:
        # Parent mode: re-run this same file N times as a child.
        for _ in range(N):
            args = [sys.executable, __file__, 'sleep']
            p = subprocess.Popen(args)  # p is rebound on every iteration
        gc.collect()
        for i in range(N):
            pid, retval = os.wait()  # Unix only: blocks until any child exits
            print('{p} finished'.format(p=pid))

The output looks something like this:

% x_fork_many.py 
15562 sleeping for 10 sec
15563 sleeping for 5 sec
15564 sleeping for 6 sec
15563 finished
15564 finished
15562 finished

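I am not sure why you get the strange behaviour when you do not call .wait(). However, the script above suggests that (at least on Unix) you do not need to keep the subprocess.Popen(...) objects in a list or set. Whatever the problem is, I do not think it has anything to do with garbage collection.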
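The garbage-collection hypothesis can also be checked in isolation. The following is a minimal sketch, assuming a Unix system and CPython; the helper name child_survives_gc and the marker file are hypothetical and only serve the illustration. It starts a child, drops the only reference to the Popen object, forces a collection, and then checks whether the child still ran to completion.

```python
import gc
import os
import subprocess
import sys
import tempfile


def child_survives_gc():
    """Start a child, drop the only Popen reference, and report whether it still finished."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "done.txt")
        # Stand-in child: sleeps briefly, then creates a marker file to show it completed.
        code = "import sys, time; time.sleep(0.5); open(sys.argv[1], 'w').close()"
        p = subprocess.Popen([sys.executable, "-c", code, marker])
        pid = p.pid
        del p          # drop the only reference to the Popen object
        gc.collect()   # force a collection; this sends no signal to the child
        os.waitpid(pid, 0)  # Unix only: block until that specific child exits
        return os.path.exists(marker)
```

If the function returns True, the child finished its work even though its Popen object had already been collected, which is consistent with the experiment above.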

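PS. Perhaps your Perl scripts conflict in some way, so that one ends with an error while another is running. Have you tried launching several invocations of the Perl script from the command line?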
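If the runs do conflict over a file with a fixed name, one possible workaround is to give each run its own working directory through the cwd argument of subprocess.Popen. This is only a sketch under assumptions: the function name run_isolated, the run_ directory prefix, and the scratch.tmp file are hypothetical, and a small Python one-liner stands in for the Perl script.

```python
import os
import subprocess
import sys
import tempfile


def run_isolated(inputs, base_dir):
    """Run one child per input in parallel, each in its own working directory."""
    # Stand-in for the Perl script: writes its argument to a fixed-name scratch file.
    child_code = "import sys; open('scratch.tmp', 'w').write(sys.argv[1])"
    procs = []
    for name in inputs:
        workdir = os.path.join(base_dir, "run_" + name)
        os.makedirs(workdir, exist_ok=True)
        # cwd= makes relative paths used by the child resolve inside workdir,
        # so parallel runs no longer overwrite each other's scratch file.
        procs.append(subprocess.Popen([sys.executable, "-c", child_code, name],
                                      cwd=workdir))
    # Wait only after every child has been started, so they overlap in time.
    return [p.wait() for p in procs]
```

This only helps when the script refers to the shared file by a relative path; a script that writes to a fixed absolute path would need a different fix.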

Answer 2

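You have to call wait() in order to ask to "wait" for the end of your popen.

Since popen runs the Perl script in the background, if you do not wait(), it will be stopped at the end of the lifetime of the object "process"... that is, at the end of script_runner.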
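To make concrete what wait() itself does, here is a small sketch that contrasts it with poll(). The function name poll_then_wait is hypothetical and a sleeping Python one-liner stands in for the Perl script. The sketch only shows that wait() blocks until the child exits; it does not test whether skipping wait() stops the child.

```python
import subprocess
import sys


def poll_then_wait(seconds=0.5):
    """Return (was the child still running right after launch, its exit code)."""
    p = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep({})".format(seconds)]
    )
    still_running = p.poll() is None  # poll() returns None while the child is alive
    exit_code = p.wait()              # wait() blocks until the child has exited
    return still_running, exit_code
```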

Answer 3

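As ericdupo said, the task is killed because you overwrite your process variable with a new Popen object, and since there are no more references to your previous Popen object, it is destroyed by the garbage collector. You can prevent this by keeping a reference to the object somewhere, for example in a list: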

processes = []
def script_runner(variable_input):

    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                           stdout=out_file, stderr=error_file)
    processes.append(process)

This should be enough to prevent your previous Popen objects from being destroyed.

Answer 4

I think you want to do

list_process = []
def script_runner(variable_input):

    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                           stdout=out_file, stderr=error_file)
    list_process.append(process)

# call script_runner several times, then wait for every process
for process in list_process:
    process.wait()

So your processes will run in parallel.
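The same start-everything-then-wait pattern can be extended to close the output files and collect exit codes. The sketch below is one possible arrangement, not the answerer's code: the name run_all and the cmd and out_dir parameters are hypothetical, and the input value is appended to the command as its last argument purely for illustration.

```python
import os
import subprocess
import sys
import tempfile


def run_all(cmd, inputs, out_dir="."):
    """Start one child per input, then wait for all of them and return exit codes."""
    jobs = []
    for name in inputs:
        out_file = open(os.path.join(out_dir, "out_" + name), "wt")
        err_file = open(os.path.join(out_dir, "error_" + name), "wt")
        # Launch without waiting, so all children run at the same time.
        proc = subprocess.Popen(cmd + [name], stdout=out_file, stderr=err_file)
        jobs.append((name, proc, out_file, err_file))

    results = {}
    for name, proc, out_file, err_file in jobs:
        results[name] = proc.wait()  # block until this particular child is done
        out_file.close()             # safe to close once the child has exited
        err_file.close()
    return results
```

A non-zero value in the returned dictionary points at the run whose error_ file is worth reading first.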

4 answers

Answer 1
2 Accepted 2010-11-12 14:26:17

Answer 2
1 2010-11-12 13:51:53

Answer 3
1 2010-11-12 14:41:20

Answer 4
0 2010-11-12 15:47:17
