
How do I make two different things happen at the same time in Python?

I'm used to multiprocessing, but now I have a problem where mp.Pool isn't the tool that I need.

I have a process that prepares input and another process that uses it. I'm not using up all of my cores, so I want to have the two go at the same time, with the first getting the batch ready for the next iteration. How do I do this? And (importantly) what is this sort of thing called, so that I can go and google it?

Here's a dummy example. The following code takes 8 seconds:

import time
def make_input():
    time.sleep(1)
    return "cthulhu r'lyeh wgah'nagl fhtagn"

def make_output(input):
    time.sleep(1)
    return input.upper()

start = time.time()
for i in range(4):
    input = make_input()
    output = make_output(input)
    print(output)

print(time.time() - start)

CTHULHU R'LYEH WGAH'NAGL FHTAGN
CTHULHU R'LYEH WGAH'NAGL FHTAGN
CTHULHU R'LYEH WGAH'NAGL FHTAGN
CTHULHU R'LYEH WGAH'NAGL FHTAGN
8.018263101577759

If I were preparing input batches at the same time as I was doing the output, it would take four seconds. Something like this:

next_input = make_input()
start = time.time()
for i in range(4):
    res = do_at_the_same_time(
        output = make_output(next_input),
        next_input = make_input()
    )
    print(output)

print(time.time() - start)

But, obviously, that doesn't work. How can I accomplish what I'm trying to accomplish?

Important note: I tried the following, but it failed because the executing worker was running in the wrong scope (at least for my actual use-case). In my dummy use-case it doesn't work because it prints in a different process.

import multiprocessing as mp

def proc(i):
    if i == 0:
        return make_input()
    if i == 1:
        return make_output(next_input)

next_input = make_input()
for i in range(4):
    pool = mp.Pool(2)
    # [0] keeps the new input; the make_output result at [1] is discarded,
    # and any printing happens in the worker process, not here
    next_input = pool.map(proc, [0, 1])[0]
    pool.close()

So I need a solution where the second process happens in the same scope or environment as the for loop, and where the first produces output that can be retrieved from that scope.
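(What is being described here is usually called pipelining, or a producer/consumer setup. A rough sketch of the do_at_the_same_time idea using a background thread, which works for the dummy example because time.sleep releases the GIL, so the two one-second waits genuinely overlap:)

```python
import time
from concurrent.futures import ThreadPoolExecutor

def make_input():
    time.sleep(1)
    return "cthulhu r'lyeh wgah'nagl fhtagn"

def make_output(inp):
    time.sleep(1)
    return inp.upper()

start = time.time()
outputs = []
with ThreadPoolExecutor(max_workers=1) as pool:
    next_input = make_input()
    for i in range(4):
        # kick off the next make_input in a background thread...
        future = pool.submit(make_input)
        # ...while make_output for the current batch runs in this scope
        outputs.append(make_output(next_input))
        next_input = future.result()
elapsed = time.time() - start
for o in outputs:
    print(o)
print(elapsed)
```

This runs in roughly 5 seconds instead of 8: one second for the first batch, then each iteration overlaps one make_input with one make_output.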

You should be able to use Pool. If I understand correctly, you want one worker to prepare the input for the next worker, which then runs and does something more with it. Given your example functions, this should do just that:

pool = mp.Pool(2)
for i in range(4):
    # blocks until make_input returns; meanwhile the other worker
    # may still be busy with the previous make_output
    next_input = pool.apply(make_input)
    # does not block; print is called with the result when it's ready
    pool.apply_async(make_output, (next_input, ), callback=print)
pool.close()
pool.join()

We prepare a pool with 2 workers, then run the loop to execute our pair of tasks four times.

We delegate make_input to a worker using apply(), wait for the function to complete, and assign the result to next_input. Note: in this example we could have used a single-worker pool and just run next_input = make_input() in the main process (i.e. the same process your script runs in), delegating only make_output().

Now the more interesting bit: by using apply_async() we ask a worker to run make_output, passing the single parameter next_input to it, and telling it to run print (or any function registered with callback) with the result of make_output as its argument.

Then we close() the pool so it accepts no more jobs, and join() to wait for the processes to complete their work.
