How to use python multiprocessing pool in continuous loop

I am using the Python multiprocessing library to run a Selenium script. My code is below:

from multiprocessing import Process

# -- start and join multiple processes --
thread_list = []
total_threads = 10  # number of parallel processes
for i in range(total_threads):
    t = Process(target=get_browser_and_start, args=[url, nlp, pixel])
    thread_list.append(t)
    print("starting thread...")
    t.start()

for t in thread_list:
    print("joining existing thread...")
    t.join()

As I understand it, join() waits for each process to complete. What I want instead is that as soon as a process finishes, it is assigned another task to perform.

It can be understood like this:

Say 8 processes are started at first.

no_of_tasks_to_perform = 100

for i in range(no_of_tasks_to_perform):
    start 8 processes
    if process no. 2 finishes executing, start a new process
    maintain 8 processes at any point in time
    until "i" reaches no_of_tasks_to_perform

Instead of starting new processes every now and then, put all your tasks into a multiprocessing.Queue() and start 8 long-running processes. Each process keeps pulling tasks from the queue and working on them until there are no tasks left.

In your case, it would look more like this:

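from multiprocessing import Queue, Process

NUM_WORKERS = 8

def worker(queue):
    while True:
        # get() blocks until a task or a stop marker arrives; this avoids the
        # race between queue.empty() and queue.get() when several workers run
        task = queue.get()
        if task is None:
            break  # stop marker: no more tasks for this worker

        # now start to work on your task
        url, nlp, pixel = task[:3]  # unpack the arguments from the task
        get_browser_and_start(url, nlp, pixel)

def main():
    queue = Queue()

    # Now put tasks into queue
    no_of_tasks_to_perform = 100

    for i in range(no_of_tasks_to_perform):
        queue.put([url, nlp, pixel, ...])

    # One stop marker per worker, so each one exits after the tasks run out
    for _ in range(NUM_WORKERS):
        queue.put(None)

    # Now start all processes
    processes = [Process(target=worker, args=(queue,)) for _ in range(NUM_WORKERS)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()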
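
If the goal is simply to keep 8 processes busy until all 100 tasks are done, multiprocessing.Pool from the standard library can do the scheduling for you: as soon as a worker finishes one task, it picks up the next pending one. The following is a minimal sketch under assumptions, not a drop-in solution. `run_task` is a hypothetical stand-in for `get_browser_and_start`, `build_tasks` and `run_all` are illustrative helper names, and the `url`, `nlp`, and `pixel` values are placeholders.

```python
from multiprocessing import Pool


def run_task(task):
    """Hypothetical stand-in for get_browser_and_start(url, nlp, pixel)."""
    url, nlp, pixel = task
    # The real Selenium work would go here; this sketch only builds a label.
    return "{} done (nlp={}, pixel={})".format(url, nlp, pixel)


def build_tasks(count):
    """Build placeholder (url, nlp, pixel) tuples; the values are assumptions."""
    return [
        ("http://example.com/page/{}".format(i), "nlp", "pixel")
        for i in range(count)
    ]


def run_all(tasks, workers=8):
    """Run every task on a pool that keeps at most `workers` processes busy."""
    with Pool(processes=workers) as pool:
        # imap_unordered yields each result as soon as its task finishes,
        # and the freed worker immediately receives the next pending task.
        return list(pool.imap_unordered(run_task, tasks))


if __name__ == "__main__":
    results = run_all(build_tasks(100))
    print(len(results), "tasks finished")
```

Results come back in completion order here. If the output order has to match the input order, pool.map or pool.imap can be used in place of imap_unordered.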
