
Python: Working with Pool, apply_async and join

I need to run a shell script that fetches a file from a server and writes it to disk, based on the arguments passed to it. I need to run this script for 20+ different arguments, so I want to run it in parallel, once per argument.

I tried to accomplish this using Pool, apply_async and join in Python, but it does not seem to work. Below is the code:

  import yaml
  from subprocess import call
  from multiprocessing import Pool

  feeder_server_conf = "/etc/feeder-servers.conf"
  with open("feed_conf.yaml", "r") as stream:
      feeds = yaml.load_all(stream)
      pool = Pool()
      for feed in feeds:
          for key, value in feed.items():
              keep_count_present = False
              name = value['name']
              age = value['age']
              if 'keep-count' in value:
                  keep_count = value['keep-count']
                  keep_count_present = True

              print "Pulling feed..."

              if keep_count_present:
                  command = "/usr/bin/pull-feed --name " + name + " --config " + feeder_server_conf + " --age " + str(age) + " --keep-count " + str(keep_count)
              else:
                  command = "/usr/bin/pull-feed --name " + name + " --config " + feeder_server_conf + " --age " + str(age)
              pool.apply_async(call, command.split())

      print "Waiting for pull-feeds to finish..."
      pool.close()
      pool.join()

The code just prints the following output and exits without pulling any of the files. Not sure what I am missing here. Any help is appreciated. Thanks!

Pulling feeds...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Pulling feed...
Waiting for pull-feeds to finish...

The script that I am trying to execute works fine when run individually (and so is not the culprit). I want the code to wait until all the files are pulled. Not sure what I am missing here; am I not using the constructs in the correct way? Any help is appreciated. Thanks!

The problem here is a simple one. As described in the documentation of subprocess.call, call's first argument is a list, but apply_async applies the list from its second argument to the function the way function(*args) would. That means your call ends up looking like call('/usr/bin/pull-feed', '--name', ...), which is not how call is used. All you need to do is replace pool.apply_async(call, command.split()) with pool.apply_async(call, [command.split()]) to pass your command as a list to the first argument of call; the final invocation made by apply_async will then look like call(['/usr/bin/pull-feed', '--name', ...]).
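
To illustrate the fix, here is a minimal sketch of the corrected loop. It reuses the paths and YAML field names from the question (/usr/bin/pull-feed, /etc/feeder-servers.conf, name, age, keep-count); it also builds the command as a list up front, which avoids split() breaking arguments that contain spaces, and optionally keeps the AsyncResult objects so that exit codes and worker exceptions can be inspected after join():

  import yaml
  from subprocess import call
  from multiprocessing import Pool

  feeder_server_conf = "/etc/feeder-servers.conf"

  with open("feed_conf.yaml", "r") as stream:
      feeds = yaml.load_all(stream)
      pool = Pool()
      results = []  # optional: keep the AsyncResult objects to check outcomes later
      for feed in feeds:
          for key, value in feed.items():
              # Build the command as a list of arguments, the form call() expects.
              command = ["/usr/bin/pull-feed",
                         "--name", value['name'],
                         "--config", feeder_server_conf,
                         "--age", str(value['age'])]
              if 'keep-count' in value:
                  command += ["--keep-count", str(value['keep-count'])]

              print "Pulling feed..."
              # Wrap the argument list in another list so apply_async passes it
              # to call() as a single positional argument: call(command).
              results.append(pool.apply_async(call, [command]))

      print "Waiting for pull-feeds to finish..."
      pool.close()
      pool.join()

      # Optional: .get() returns the exit code of each pull-feed run and
      # re-raises any exception that occurred in the worker.
      for result in results:
          print result.get()

If you are on Python 3, the print statements would need parentheses and yaml.load_all typically needs an explicit Loader (or yaml.safe_load_all), but the apply_async fix is the same.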
