Python: using threads to call subprocess.Popen multiple times

I have a service that is running (Twisted jsonrpc server). When I make a call to "run_procs" the service will look at a bunch of objects and inspect their timestamp property to see if they should run. If they should, they get added to a thread_pool (list) and then every item in the thread_pool gets its start() method called.

I have used this setup for several other applications where I wanted to run a function within my class with threading. However, when I am using a subprocess.Popen call in the function called by each thread, the calls run one at a time instead of running concurrently like I would expect.

Here is some sample code:

import os
import subprocess
import threading
import time

# jsonrpc comes from the Twisted JSON-RPC server package in use

class ProcService(jsonrpc.JSONRPC):
        def __init__(self):
                self.thread_pool = []
                self.running_threads = []
                self.lock = threading.Lock()

        def clean_pool(self, thread_pool, join=False):
                for th in [x for x in thread_pool if not x.isAlive()]:
                        if join: th.join()
                        thread_pool.remove(th)
                        del th
                return thread_pool

        def run_threads(self, parallel=10):
                while len(self.running_threads)+len(self.thread_pool) > 0:
                        self.clean_pool(self.running_threads, join=True)
                        n = min(max(parallel - len(self.running_threads), 0), len(self.thread_pool))
                        if n > 0:
                                for th in self.thread_pool[0:n]: th.start()
                                self.running_threads.extend(self.thread_pool[0:n])
                                del self.thread_pool[0:n]
                        time.sleep(.01)
                for th in self.running_threads+self.thread_pool: th.join()

        def jsonrpc_run_procs(self):
                for i, item in enumerate(self.items):
                        if item.should_run():
                                self.thread_pool.append(threading.Thread(target=self.run_proc, args=tuple([item])))
                self.run_threads(5)

        def run_proc(self, proc):
                self.lock.acquire()
                print "\nSubprocess started"
                p = subprocess.Popen('%s/program_to_run.py %s' %(os.getcwd(), proc.data), shell=True, stdin=subprocess.PIPE, stdout=subprocess.PIPE,)
                stdout_value = proc.communicate('through stdin to stdout')[0]
                self.lock.release()

Any help/suggestions are appreciated.

* EDIT * OK. So now I want to read back the output from the stdout pipe. This works some of the time, but also fails with select.error: (4, 'Interrupted system call'). I assume this is because sometimes the process has already terminated before I try to run the communicate method. The code in the run_proc method has been changed to:

def run_proc(self, proc):
    self.lock.acquire()
    p = subprocess.Popen( # etc
    self.running_procs.append([p, proc.data.id])
    self.lock.release()

After I call self.run_threads(5) I call self.check_procs().

The check_procs method iterates the list of running_procs to check whether poll() is not None. How can I get output from the pipe? I have tried both of the following:

calling check_procs once:

def check_procs(self):
    for proc_details in self.running_procs[:]:    # iterate over a copy, since we remove below
        proc = proc_details[0]
        while proc.poll() is None:
            time.sleep(0.1)
        stdout_value = proc.communicate('through stdin to stdout')[0]
        self.running_procs.remove(proc_details)
        print proc_details[1], stdout_value

calling check_procs in a while loop like:

while len(self.running_procs) > 0:
    self.check_procs()

def check_procs(self):
    for proc_details in self.running_procs[:]:    # iterate over a copy, since we remove below
        proc = proc_details[0]
        if proc.poll() is not None:
            stdout_value = proc.communicate('through stdin to stdout')[0]
            self.running_procs.remove(proc_details)
            print proc_details[1], stdout_value

I think the key code is:

    self.lock.acquire()
    print "\nSubprocess started"
    p = subprocess.Popen( # etc
    stdout_value = proc.communicate('through stdin to stdout')[0]
    self.lock.release()

The explicit calls to acquire and release should guarantee serialization -- don't you observe serialization just as invariably if you do other things in this block instead of the subprocess use?

Edit: all silence here, so I'll add the suggestion to remove the locking and instead put each stdout_value on a Queue.Queue() instance -- Queue is intrinsically threadsafe (deals with its own locking) so you can get (or get_nowait, etc.) results from it once they're ready and have been put there. In general, Queue is the best way to arrange thread communication (and often synchronization too) in Python, any time it can feasibly be arranged to do things that way.

Specifically: add import Queue at the start; give up making, acquiring and releasing self.lock (just delete those three lines); add self.q = Queue.Queue() to the __init__; right after the call stdout_value = proc.communicate(... add one statement self.q.put(stdout_value); now e.g. finish the jsonrpc_run_procs method with

while not self.q.empty():
  result = self.q.get()
  print 'One result is %r' % result

to confirm that all the results are there. (Normally the empty method of queues is not reliable, but in this case all threads putting to the queue are already finished, so you should be fine.)
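Putting those steps together, here is a small self-contained sketch of the Queue suggestion, written against Python 3 (where the module is named queue rather than Queue); the three-worker loop and the "payload" echoing child command are just placeholders, not the original service's code:

```python
# Sketch: each thread runs its own Popen with no shared lock and puts the
# child's stdout on a thread-safe queue for the caller to drain afterwards.
import queue
import subprocess
import sys
import threading

q = queue.Queue()

def run_proc(data):
    # The child just echoes stdin back to stdout; with no shared lock,
    # all three children can run concurrently.
    p = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    stdout_value = p.communicate(("payload %d" % data).encode())[0]
    q.put(stdout_value)           # Queue does its own locking

threads = [threading.Thread(target=run_proc, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()                      # all put() calls are done, so empty() is reliable

results = []
while not q.empty():
    results.append(q.get())
print(sorted(results))
```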

Your specific problem is probably caused by the line stdout_value = proc.communicate('through stdin to stdout')[0]. Subprocess.communicate will "Wait for process to terminate", which, when used with a lock, will run one at a time.
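To see that serialization concretely, a minimal Python 3 sketch (the 0.3-second sleeping child and the helper name are illustrative only) that times three communicate() calls with and without a shared lock:

```python
# communicate() blocks until the child exits, so a lock held around
# Popen + communicate() is held for the child's entire lifetime,
# forcing the children to run one at a time.
import subprocess
import sys
import threading
import time

CHILD = [sys.executable, "-c", "import time; time.sleep(0.3)"]

def run_children(n, lock=None):
    """Spawn n sleeping children from n threads; return wall-clock seconds."""
    def worker():
        if lock is not None:
            with lock:                              # held across the whole child run
                subprocess.Popen(CHILD).communicate()
        else:
            subprocess.Popen(CHILD).communicate()   # no lock: children overlap
    threads = [threading.Thread(target=worker) for _ in range(n)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start

if __name__ == "__main__":
    serial = run_children(3, lock=threading.Lock())  # roughly 3 * 0.3s
    parallel = run_children(3)                       # roughly 0.3s
    print("with lock: %.2fs, without: %.2fs" % (serial, parallel))
```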

What you can do is simply add the p variable to a list and use the subprocess API to wait for the subprocesses to finish, periodically polling each subprocess in your main thread.
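A lock-free sketch of that polling approach (Python 3; run_and_collect and the echoing children are hypothetical names, not from the original service). Note that draining stdout only after poll() reports exit is fine for small outputs; a child that writes more than a pipe buffer's worth would block before exiting and should be drained with communicate() instead:

```python
# Start every Popen immediately, keep the handles in a list, and let the
# main thread poll them, reading stdout once each child has exited.
import subprocess
import sys
import time

def run_and_collect(cmds):
    """Start all commands at once; return {index: stdout_bytes} when all exit."""
    running = [(i, subprocess.Popen(cmd, stdout=subprocess.PIPE))
               for i, cmd in enumerate(cmds)]
    results = {}
    while running:
        still_running = []
        for ident, p in running:
            if p.poll() is None:              # child not finished yet
                still_running.append((ident, p))
            else:                             # exited: communicate() returns at once
                results[ident] = p.communicate()[0]
        running = still_running
        time.sleep(0.05)                      # periodic poll from the main thread
    return results

if __name__ == "__main__":
    cmds = [[sys.executable, "-c", "print(%d)" % n] for n in range(3)]
    print(run_and_collect(cmds))
```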

On second look, it looks like you may have an issue on this line as well: for th in self.running_threads+self.thread_pool: th.join(). Thread.join() is another method that will wait for the thread to finish.
