
Python multiprocessing is slower if more packages are imported

I noticed that my code ran slower if I imported some packages (even if I didn't use them).

Example 1:

import time
from multiprocessing import Process

def donothing(seconds):
    # Worker that does no work; any measured time is pure process start-up overhead.
    pass

if __name__ == "__main__":
    processes = []

    start = time.time()
    for x in range(10):
        p = Process(target=donothing, args=[1])
        p.start()
        processes.append(p)

    for p in processes:
        p.join()

    end = time.time()

    # perf_counter() is read only once, so it prints the counter's raw value,
    # not an elapsed interval; (end - start) is the wall-clock time spent
    # starting and joining the 10 processes.
    finish = time.perf_counter()
    print("Finished running after seconds : (perf_counter){0}  (sub){1}".format(finish, (end - start)))

Output:

Finished running after seconds : (perf_counter)0.19179975  (sub)0.15710902214050293

Example 2:

If I just add import pandas at the beginning of the code, it runs much slower.

Output:

Finished running after seconds : (perf_counter)1.264330583  (sub)0.8281588554382324

Why does this happen, and how can I avoid it?

Each child process starts a new Python interpreter and repeats its own import overhead. How the new process is created depends on your OS: with the "spawn" start method (the default on Windows and macOS), each child re-imports your main module, including pandas, while "fork" (the default on Linux) copies the already-initialized parent process instead. Try experimenting with:

from multiprocessing import set_start_method
set_start_method("fork", force=True)
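As a minimal sketch (assuming a Unix-like OS where "fork" is available), the start method would be set inside the __main__ guard before any processes are created; the heavy import is then paid once in the parent and inherited by the children:

import time
from multiprocessing import Process, set_start_method

import pandas  # heavy import: with "fork" it is not repeated in each child

def donothing(seconds):
    pass

if __name__ == "__main__":
    # "fork" clones the already-initialized parent instead of starting a
    # fresh interpreter that would re-import this module (and pandas).
    # It is only available on Unix-like systems; on Windows, "spawn" is the only option.
    set_start_method("fork", force=True)

    processes = []
    start = time.time()
    for x in range(10):
        p = Process(target=donothing, args=[1])
        p.start()
        processes.append(p)

    for p in processes:
        p.join()

    print("Finished running after seconds : {0}".format(time.time() - start))

On Windows, where "fork" is not available, one common alternative is to keep expensive imports out of the script's top level (for example, import pandas only inside the functions that actually need it), so the re-import performed by each spawned child stays cheap.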
