Python multiprocess Process never terminates
My routine below takes a list of urllib2.Requests and, for each request, spawns a new process and fires it off. The purpose is asynchronous speed, so it is all fire-and-forget (no response needed). The issue is that the processes spawned in the code below never terminate, so after a few of these calls the box runs out of memory (OOM). Context: Django web app. Any help?
import multiprocessing
import urllib2
import logging
import traceback

MP_CONCURRENT = int(multiprocessing.cpu_count()) * 2
if MP_CONCURRENT < 2: MP_CONCURRENT = 2
MPQ = multiprocessing.JoinableQueue(MP_CONCURRENT)

def request_manager(req_list):
    try:
        # put request list in the queue
        for req in req_list:
            MPQ.put(req)

            # call processes on queue
            worker = multiprocessing.Process(target=process_request, args=(MPQ,))
            worker.daemon = True
            worker.start()

        # move on after queue is empty
        MPQ.join()

    except Exception, e:
        logging.error(traceback.print_exc())

# process requests in queue
def process_request(MPQ):
    try:
        while True:
            req = MPQ.get()
            dr = urllib2.urlopen(req)
            MPQ.task_done()

    except Exception, e:
        logging.error(traceback.print_exc())
Maybe I'm wrong about this, but:
MP_CONCURRENT = int(multiprocessing.cpu_count()) * 2
if MP_CONCURRENT < 2: MP_CONCURRENT = 2
MPQ = multiprocessing.JoinableQueue(MP_CONCURRENT)

def request_manager(req_list):
    try:
        # put request list in the queue
        pool = []
        for req in req_list:
            MPQ.put(req)

            # call processes on queue
            worker = multiprocessing.Process(target=process_request, args=(MPQ,))
            worker.daemon = True
            worker.start()
            pool.append(worker)

        # move on after queue is empty
        MPQ.join()

        # close the processes that are no longer needed
        for p in pool: p.terminate()

    except Exception, e:
        logging.error(traceback.print_exc())

# process requests in queue
def process_request(MPQ):
    try:
        while True:
            req = MPQ.get()
            dr = urllib2.urlopen(req)
            MPQ.task_done()

    except Exception, e:
        logging.error(traceback.print_exc())
Or, using a Pool instead of spawning one Process per request:

MP_CONCURRENT = int(multiprocessing.cpu_count()) * 2
if MP_CONCURRENT < 2: MP_CONCURRENT = 2
MPQ = multiprocessing.JoinableQueue(MP_CONCURRENT)
CHUNK_SIZE = 20  # number of requests sent to one process
pool = multiprocessing.Pool(MP_CONCURRENT)

def request_manager(req_list):
    try:
        # put request list in the queue
        response = pool.map(process_request, req_list, CHUNK_SIZE)  # exits after all requests are done and the pool's work has ended
        # OR
        response = pool.map_async(process_request, req_list, CHUNK_SIZE)  # request_manager exits after all requests are passed to the pool

    except Exception, e:
        logging.error(traceback.print_exc())

# process requests in queue
def process_request(req):
    dr = urllib2.urlopen(req)
This is about 5-10 times faster than your code. Also consider integrating a side "broker" into django (for example rabbitmq or something similar).
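A minimal sketch of that broker idea, assuming RabbitMQ with the pika client (pika >= 1.0 signatures); the queue name and the publish_requests / run_worker helpers are illustrative only, not part of the original answer. The Django view only publishes; a separate long-running consumer process performs the urlopen calls.

import json
import pika      # RabbitMQ client, assumed installed (pip install pika)
import urllib2

QUEUE = 'fire_and_forget_requests'  # hypothetical queue name

def publish_requests(url_list):
    # called from the Django view: hand the URLs to the broker and return immediately
    connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE, durable=True)
    for url in url_list:
        channel.basic_publish(exchange='', routing_key=QUEUE, body=json.dumps({'url': url}))
    connection.close()

def run_worker():
    # separate long-running process (started outside Django) that drains the queue
    connection = pika.BlockingConnection(pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE, durable=True)

    def handle(ch, method, properties, body):
        try:
            urllib2.urlopen(json.loads(body)['url'], timeout=30)
        except Exception:
            pass  # fire-and-forget: failures are simply ignored
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue=QUEUE, on_message_callback=handle)
    channel.start_consuming()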
Well, after some fiddling around (and a good night's sleep) I believe I have figured out the problem (and thank you Eri, you were the inspiration I needed). The main issue with the zombie processes was that I was not signalling that a process had finished (and killing it off); I (naively) assumed that all of that happened automatically with multiprocess.
The code that works:
import sys
import multiprocessing
import urllib2
import logging
import traceback

# function that will be run through the pool
def process_request(req):
    try:
        dr = urllib2.urlopen(req, timeout=30)

    except Exception, e:
        logging.error(traceback.print_exc())

# process killer
def sig_end(r):
    sys.exit()

# globals
MP_CONCURRENT = int(multiprocessing.cpu_count()) * 2
if MP_CONCURRENT < 2: MP_CONCURRENT = 2
CHUNK_SIZE = 20
POOL = multiprocessing.Pool(MP_CONCURRENT)

# pool initiator
def request_manager(req_list):
    try:
        resp = POOL.map_async(process_request, req_list, CHUNK_SIZE, callback=sig_end)

    except Exception, e:
        logging.error(traceback.print_exc())
A few notes:

1) The function hit by "map_async" ("process_request" in this example) must be defined first (and before the globals are declared).

2) There is probably a more graceful way to exit the process (suggestions are welcome); one possible alternative is sketched after these notes.

3) Using a pool really was best in this example (thanks again Eri), since the "callback" feature lets me fire off the signal immediately.
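On note 2, here is one possible alternative, just a sketch and not part of the accepted solution above: create a fresh Pool per batch and let close()/join() reap the workers instead of calling sys.exit() from the callback, moving the blocking join() into a helper thread so the Django view does not wait.

import multiprocessing
import threading
import urllib2

MP_CONCURRENT = max(multiprocessing.cpu_count() * 2, 2)
CHUNK_SIZE = 20

def process_request(req):
    try:
        urllib2.urlopen(req, timeout=30)
    except Exception:
        pass  # fire-and-forget: failures are simply ignored

def request_manager(req_list):
    # a fresh pool per batch; close()/join() guarantees the workers exit
    pool = multiprocessing.Pool(MP_CONCURRENT)
    pool.map_async(process_request, req_list, CHUNK_SIZE)
    pool.close()  # no more tasks will be submitted to this pool
    # join() blocks until the batch is done, so run it in a helper thread
    # to keep the calling Django view from waiting
    threading.Thread(target=pool.join).start()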