
Why does my multiprocessing code take so long to complete in Python 3.9 but not Python 2.7? Code improvements?

I have some code that uses multiprocessing to perform work with apply_async while the main thread updates the GUI and allows other activities to be performed. Everything seems to work just fine in Python 2.7; however, I am running into issues in Python 3.9. My overall problem is that the full program no longer works, but while putting together the simplified debug code below (which does work) I noticed a significant increase in the time it takes for my process to complete in 3.9 versus 2.7.

Simplified code is as follows:

import multiprocessing
import time
import datetime


def main():
    start_time = datetime.datetime.now()
    print('Spinning up pool')
    pool = multiprocessing.Pool(processes=10)
    vals = range(100)
    results = []
    print('Adding processes')
    runs = [pool.apply_async(calc, (x, 1), callback=results.append) for x in vals]

    print('Working...')
    while len(vals) != len(results):
        print('Results: {}'.format(results))
        time.sleep(1)

    pool.close()
    pool.join()
    print('Done')
    end_time = datetime.datetime.now()
    duration = end_time - start_time
    print('Program took {} seconds to complete'.format(duration.total_seconds()))

def calc(x, y):
    print(x + y)
    time.sleep(2)
    return x + y

if __name__ == "__main__":
    main()

python 2.7:

Program took 48.965 seconds to complete

python 3.9:

Program took 372.522254 seconds to complete

Is there a reason this takes so much longer in 3.9 than in 2.7? Are there any modifications I can make to my code to speed things up? Is there a better way to process tasks like this while waiting for a pool to finish all the work?

Operating system is Windows 10.

I could not find a significant difference in running times between Python 3.8 and Python 2.7 on Windows. But the way I would approach the problem is by noting that your worker function calc is mostly wait time with only a small amount of CPU-bound calculation. So one thing you could do is increase your pool size significantly, since your processes will be in a wait state most of the time.
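As a minimal sketch of that first option (the pool size of 100, one slot per task, is my assumption; the function names mirror the question's code):

```python
import multiprocessing
import time
from functools import partial


def calc(x, y):
    time.sleep(2)  # the task is almost entirely wait time, not CPU work
    return x + y


def main(n_tasks=100, pool_size=100):
    # Because each task spends nearly all of its time sleeping, a pool as
    # large as the task count lets every task wait concurrently; the whole
    # batch then finishes in roughly one task's duration plus the
    # process-startup overhead.
    with multiprocessing.Pool(processes=pool_size) as pool:
        return pool.map(partial(calc, y=1), range(n_tasks))


if __name__ == "__main__":
    print(main(10, 10))  # small demo run
```

Process startup is not free (especially on Windows, where processes are spawned rather than forked), which is why the thread-pool approach below is the better fit for wait-dominated tasks.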

But better would be to use a thread pool of size 100 (since you are submitting 100 tasks) and pass your worker function a multiprocessing pool (whose size is the number of cores you have) that can be used for the calculations that actually require CPU. And since it appears that your results list will contain the worker function's return values in completion order, you might as well use the multiprocessing.pool.ThreadPool.imap_unordered method rather than multiple calls to apply_async:

import multiprocessing
import multiprocessing.pool  # ThreadPool lives in this submodule; import it explicitly
import time
import datetime
from functools import partial


def main():
    start_time = datetime.datetime.now()
    pool = multiprocessing.Pool()
    thread_pool = multiprocessing.pool.ThreadPool(100)
    vals = range(100)
    worker = partial(calc, y=1, pool=pool)
    results = list(thread_pool.imap_unordered(worker, vals))
    end_time = datetime.datetime.now()
    duration = end_time - start_time
    print('Program took {} seconds to complete'.format(duration.total_seconds()))
    print(results)
    pool.close()
    pool.join()

def calc(x, y, pool):
    result = pool.apply(do_add, (x, y))
    print(result)
    time.sleep(2)
    return result

def do_add(x, y):
    return x + y

if __name__ == "__main__":
    main()

Prints:

4
10
9
6
...
51
27
96
87
21
Program took 2.276529 seconds to complete
[4, 10, 30, 82, 36, 70, 83, 13, 12, 46, 71, 62, 59, 16, 73, 8, 9, 80, 72, 55, 11, 7, 43, 65, 35, 19, 61, 76, 60, 3, 57, 48, 49, 6, 45, 34, 41, 33, 31, 37, 22, 29, 24, 67, 32, 18, 14, 1, 23, 2, 15, 64, 53, 58, 98, 95, 40, 47, 39, 54, 92, 89, 93, 42, 85, 38, 99, 5, 78, 74, 25, 63, 84, 56, 91, 69, 79, 50, 44, 100, 90, 86, 81, 77, 94, 68, 28, 26, 17, 20, 87, 75, 96, 66, 52, 51, 21, 88, 27, 97]

Discussion

If your worker function, calc, really were as trivial as the one above, i.e. requiring only a simple addition, then I would not bother using a multiprocessing pool for the calculation at all; I would just use a multithreading pool, because performing the addition directly in calc is cheaper than submitting a task to the multiprocessing pool. But assuming your actual worker function has significant CPU requirements, the idea is to break those CPU-bound calculations out into one or more new functions and retain in your original worker function only the code that is thread-efficient (e.g. I/O operations, network operations, sleeping: operations that put the thread into a wait state and therefore allow another thread to run). The CPU-intensive calculations are then performed by calling method apply on the multiprocessing pool, passing the new CPU-intensive functions you have created (or method apply_async if you have multiple, unrelated calculations that you can perform in parallel).
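For the trivial case, a sketch of the thread-pool-only version (not the answer's exact code; the task count parameter is mine, added to keep the example self-contained):

```python
import time
from functools import partial
from multiprocessing.pool import ThreadPool


def calc(x, y):
    time.sleep(2)  # wait-dominated: the GIL is released while sleeping
    return x + y   # trivial arithmetic: not worth a process round-trip


def main(n_tasks=100):
    # One thread per task: threads are far cheaper to start than processes,
    # and since every task is mostly waiting, all of the sleeps overlap.
    with ThreadPool(n_tasks) as thread_pool:
        return list(thread_pool.imap_unordered(partial(calc, y=1),
                                               range(n_tasks)))


if __name__ == "__main__":
    print(sorted(main()))
```

The whole batch completes in roughly one task's two-second sleep, with none of the interpreter-startup cost that spawning worker processes incurs.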
