Python multiprocessing.pool failed to stop after finishing all the tasks

I have implemented a parser like this:

import multiprocessing
import time

def foo(i):
    try:
        pass  # some code
    except Exception as e:
        print(e)

def worker(i):
    foo(i)
    time.sleep(i)
    return i


if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=4)
    result = pool.map_async(worker, range(15))
    while not result.ready():
        print("num left: {}".format(result._number_left))
        time.sleep(1)
    real_result = result.get()
    pool.close()
    pool.join()

My parser actually finishes all the tasks, but the results never become available: it stays inside the while loop, printing num left: 2. How do I stop this? I don't need the value of the real_result variable.

I'm running Ubuntu 14.04 with Python 2.7.

The corresponding part of my code looks like this:

    async_args = ((date, kw_dict) for date in dates)
    pool = Pool(processes=4)
    no_rec = []

    def check_for_exit(msg):
        print msg
        if last_date in msg:
            print 'Terminating the pool'
            pool.terminate()
    try:
        result = pool.map_async(parse_date_range, async_args)
        while not result.ready():
            print("num left: {}".format(result._number_left))
            sleep(1)

        real_result = result.get(5)

        passed_dates = []

        for x, y in real_result:
            passed_dates.append(x)
            if y:
                no_rec.append(y[0])

        # if last_date in passed_dates:
        #     print 'Terminating the pool'
        #     pool.terminate()

        pool.close()
    except:

        print 'Pool error'
        pool.terminate()
        print traceback.format_exc()
    finally:
        pool.join()

My bet is that parse_date_range is faulty and causes a worker process to terminate without producing either a result or a Python exception. Most likely a C module or library calls libc's exit because of a really nasty error.
This code reproduces the infinite loop you observe:

import sys
import multiprocessing
import time

def parse_date_range(i):
    if i == 5:
        sys.exit(1) # or raise SystemExit; 
                    # other exceptions are handled by the pool
    time.sleep(i/19.)
    return i


if __name__ == "__main__":
    pool = multiprocessing.Pool(4)
    result = pool.map_async(parse_date_range, range(15))
    while not result.ready():
        print("num left: {}".format(result._number_left))
        time.sleep(1)
    real_result = result.get()
    pool.close()
    pool.join()
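
To stop waiting when a task has been lost like this, one option is to give the wait a deadline and terminate the pool if it passes. The following is only a sketch: get_or_terminate is a hypothetical helper (not part of multiprocessing), and the timeout value is an assumption you would have to choose for your own workload.

```python
def get_or_terminate(pool, async_result, timeout):
    """Wait up to `timeout` seconds for `async_result`.

    Returns the results if they arrive in time. Otherwise terminates the
    pool and returns None. Hypothetical helper, not part of multiprocessing.
    """
    # AsyncResult.wait() returns once the result is ready or after
    # `timeout` seconds, whichever comes first.
    async_result.wait(timeout)
    if async_result.ready():
        pool.close()
        pool.join()
        return async_result.get()
    # The result never arrived (for example, a worker exited mid-task),
    # so stop the remaining workers instead of polling forever.
    pool.terminate()
    pool.join()
    return None
```

In the reproduction, replacing the while loop and the lines after it with real_result = get_or_terminate(pool, result, 10) should return None after roughly 10 seconds instead of printing num left forever.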

Hope this'll help.
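
If the exit turns out to come from Python code (a sys.exit call or raise SystemExit), one possible guard is to wrap the task so that the pool always receives a value. This is a minimal sketch under that assumption: safe_parse_date_range is a hypothetical name, and parse_date_range here is the stand-in from the reproduction above, not your real parser. It would not help if a C library calls exit directly, because no Python exception is raised in that case.

```python
import sys


def parse_date_range(i):
    # Stand-in for the real parser: exits abruptly for one input,
    # as in the reproduction above.
    if i == 5:
        sys.exit(1)
    return i


def safe_parse_date_range(i):
    """Return (input, result, error) so the pool always gets a value back.

    SystemExit does not derive from Exception, so it has to be caught
    explicitly. Hypothetical wrapper for illustration only.
    """
    try:
        return i, parse_date_range(i), None
    except SystemExit as exc:
        return i, None, "worker tried to exit with code {!r}".format(exc.code)
```

Passing safe_parse_date_range to pool.map_async instead of parse_date_range lets result.ready() become true, and the failing input shows up as an entry whose error field is not None.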
