
Run several Python programs at the same time

I have a Python script, run.py:

import sys

def do(i):
    # do something with i that takes time
    ...

start_i = int(sys.argv[1])  # argv values are strings, so convert to int
end_i = int(sys.argv[2])
for i in range(start_i, end_i):
    do(i)

Then I run this script:

python run.py 0 1000000

After 30 minutes the script completes. But that's too long for me.

So I create a bash script, run.sh:

python run.py 0 200000 &
python run.py 200000 400000 &
python run.py 400000 600000 &
python run.py 600000 800000 &
python run.py 800000 1000000
wait  # also wait for the four background jobs to finish

Then I run this script:

bash run.sh

After 6 minutes the script completes. Rather good. I'm happy.

But I think there is another way to solve the problem (without creating a bash script), isn't there?

You're looking for the multiprocessing package, and especially the Pool class:

from multiprocessing import Pool

with Pool(5) as p:  # as in your example: five separate worker processes
    p.map(do, range(start_i, end_i))

Besides consolidating this into a single command, this has another advantage over calling python run.py 0 200000 & and so on: if some chunks take longer than others (so python run.py 0 200000 might finish before the rest), Pool keeps all five worker processes busy until the whole range is done.
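Putting it together, run.py rewritten around Pool could look roughly like this (a minimal sketch that keeps your command-line interface; the if __name__ == '__main__' guard is needed so worker processes don't re-execute the pool setup on platforms that spawn new interpreters rather than fork):

import sys
from multiprocessing import Pool

def do(i):
    ...  # the real per-item work goes here

if __name__ == '__main__':
    start_i = int(sys.argv[1])
    end_i = int(sys.argv[2])
    with Pool(5) as p:
        # map hands items to whichever worker is free,
        # so no process finishes early and then sits idle
        p.map(do, range(start_i, end_i))

The original invocation, python run.py 0 1000000, then works unchanged.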

Note that, depending on your machine, running too many processes at once can slow them all down. For a start, it depends on how many cores your processor has, as well as on what else you are running at the same time.
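If you don't want to hard-code the pool size, one common pattern (an optional tweak, not something your setup requires) is to size the pool from the core count:

import os
from multiprocessing import Pool

def do(i):
    return i * i  # stand-in for the real work

if __name__ == '__main__':
    # Pool() with no argument already defaults to os.cpu_count() workers;
    # passing the number explicitly just makes the sizing decision visible
    with Pool(os.cpu_count() or 1) as p:
        p.map(do, range(1_000_000))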

You could have your Python program create the independent processes instead of bash doing it, but that's not much different. What is it about your solution that you find deficient?
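For illustration, a minimal sketch of doing exactly that with the standard subprocess module (the ranges are the ones from your run.sh):

import subprocess
import sys

# launch five copies of run.py on adjacent ranges, then wait for all of them,
# mirroring what the bash script does with `&` and `wait`
ranges = [(0, 200000), (200000, 400000), (400000, 600000),
          (600000, 800000), (800000, 1000000)]
procs = [subprocess.Popen([sys.executable, 'run.py', str(a), str(b)])
         for a, b in ranges]
for proc in procs:
    proc.wait()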
