
Multithreading a for loop in Python

I am creating some scripts to help with my database import while using Docker. I currently have a directory filled with data, and I want to import it as quickly as possible.

All of the work is single-threaded, so I wanted to speed things up by passing off multiple jobs at once, one to each core on my server.

This is the code I've written to do that:

#!/usr/bin/python
import sys
import subprocess

cities = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]

for x in cities:
    dockerscript = "docker exec -it worker_1 perl import.pl ./public/%s %s mariadb" % (x,x)
    p = subprocess.Popen(dockerscript, shell=True, stderr=subprocess.PIPE)

This works fine if I have more than 10 cores: each job gets its own. What I want is to set it up so that, if I have 4 cores, the first 4 instances of dockerscript run (1 to 4) while 5 to 10 wait.

Once any of 1 to 4 completes, 5 is started, and so on until they have all completed.

I am just having a hard time figuring out how to do this in Python.

Thanks

You should use multiprocessing.Pool(), which will automatically create one process per core, then submit your jobs to it. Each job will be a function that calls subprocess to start Docker. Note that you need to make sure the jobs are synchronous, i.e. the Docker command must not return before it is done working.
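A minimal sketch of that approach (assuming the same docker command as in the question, with a process pool sized to the CPU count) might look like this:

#!/usr/bin/python
import subprocess
from multiprocessing import Pool

cities = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]

def run_docker(city):
    # subprocess.call blocks until the docker command exits, so the
    # worker process stays busy for the whole import of this city.
    return subprocess.call(['docker', 'exec', '-it', 'worker_1', 'perl',
                            'import.pl', './public/{0}'.format(city),
                            city, 'mariadb'])

if __name__ == '__main__':
    pool = Pool()   # defaults to one worker process per core
    exit_codes = pool.map(run_docker, cities)
    pool.close()
    pool.join()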

John already has the answer, but there are a couple of subtleties worth mentioning. A thread pool is fine for this application because the threads just spend their time blocked, waiting for the subprocess to terminate. And you can use map with chunksize=1, so the pool goes back to the parent to fetch a new job on each iteration.

#!/usr/bin/python
import subprocess
import multiprocessing.pool

cities = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]

def run_docker(city):
    # subprocess.call blocks until the docker command exits, so each
    # worker thread is tied up for the duration of one import.
    return subprocess.call(['docker', 'exec', '-it', 'worker_1', 'perl',
        'import.pl', './public/{0}'.format(city), city, 'mariadb'])

# ThreadPool() defaults to one worker thread per CPU core.
pool = multiprocessing.pool.ThreadPool()
# chunksize=1 hands out one city at a time, so a thread fetches a new
# job as soon as its current import finishes.
results = pool.map(run_docker, cities, chunksize=1)
pool.close()
pool.join()
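
If you want fewer concurrent imports than cores (for example, to leave some CPU for MariaDB itself), the pool size can be set explicitly:

pool = multiprocessing.pool.ThreadPool(processes=4)  # cap at 4 concurrent imports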
