
While loop won't break in python multiprocessing

As I have not yet received any feedback on this question, I have re-conceptualized (and simplified) the problem here. In this simplified case, I'm wondering how I can restart a while loop that won't break inside a multiprocessing pool.

I want to run a function on several cores using the multiprocessing library.

import multiprocessing
import timeit

def function_to_call(arg):
    result = []
    time = timeit.default_timer()               # timestamp from the timeit library
    time_even = int(str(time)[-1])              # get last digit of the timestamp
########## below can't be changed ###########
    while True:
        if (time_even % 2) == 0:                # if last digit is even, do stuff
            result.append('problem')
        else:                                   # else append arg to result
            result.append(str(arg))
            break
########## above can't be changed ###########
    print(result)
    return result                               # and return result

if __name__ == '__main__':
    p = multiprocessing.Pool(5)                       # Start a pool with 5 worker processes
    analysis = p.map(function_to_call, 'abcdefghij')  # Call function_to_call once per letter
                                                      # of the string 'abcdefghij', spread over
                                                      # the workers, and collect the results
    print(analysis)

The result differs from run to run, depending on when the script is executed. In my case, the output in the terminal is:

['b']
['c']
['g']
['h']
['i']
['e']  # <--- note that the processes run in parallel, not serially

The conclusion is that the function call gets stuck in the while loop when it is called with the arguments 'a', 'd', 'f', and 'j' (because the last digit of the timestamp is evidently even in those cases). However, if I add "and False" to the if condition in the while loop so that it always breaks, the following is printed to the terminal, indicating that everything works fine (print(result) is executed):

['a']
['b']
['d']
['c']
['g']
['h']
['f']
['j']
['i']
['e']
[['a'], ['b'], ['c'], ['d'], ['e'], ['f'], ['g'], ['h'], ['i'], ['j']]
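
The hang described above can be reproduced without blocking anything. This is only a sketch: bounded_loop is a hypothetical stand-in for the loop in function_to_call, and the max_iterations cap is an addition that the original loop does not have. It shows that the parity is computed once before the loop and never changes inside it, so an even digit can never reach the break.

```python
def bounded_loop(time_even, arg, max_iterations=5):
    """Replica of the loop in function_to_call, capped so it cannot hang.

    max_iterations is an assumption added for this demo only.
    """
    result = []
    for _ in range(max_iterations):
        if (time_even % 2) == 0:     # time_even never changes inside the loop
            result.append('problem')
        else:
            result.append(str(arg))
            break
    return result

print(bounded_loop(4, 'a'))  # ['problem', 'problem', 'problem', 'problem', 'problem']
print(bounded_loop(7, 'b'))  # ['b']
```

With an even digit the loop only ends because of the cap; without it, result would grow until the process is killed.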

In the script I'm working on, function_to_call sometimes does not return an output. However, rerunning the function several times will eventually produce a result (which is what I wanted to imitate with the timestamp). Therefore, I would like to adapt my script so that function_to_call is called again with the same argument when it does not return an output.

Unfortunately, the function I want to call takes several hours to finish, so I don't want to force it to break after some preset time value. I'll be thankful for every comment and all suggestions!

Re-run it with an exponentially increasing timeout:

from stopit import ThreadingTimeout

def call_until_done(arg, timeout=3600):  # timeout should probably be set slightly above the expected runtime
    while True:
        with ThreadingTimeout(timeout) as timeout_ctx:
            result = function_to_call(arg)
        if timeout_ctx.state != timeout_ctx.TIMED_OUT:
            return result
        timeout *= 2                     # double the timeout before trying again

Because the timeout doubles on every retry, the time spent on aborted attempts always adds up to less than the timeout of the final attempt, so the wasted work stays bounded.
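
To make the arithmetic behind that bound concrete, here is a small sketch. The helper names timeout_schedule and wasted_time are made up for illustration and are not part of stopit. The numbers assume the initial timeout of 3600 seconds from the example and a call that succeeds on the fourth attempt.

```python
def timeout_schedule(initial, attempts):
    """Timeouts used by `attempts` consecutive tries, doubling each time."""
    return [initial * 2 ** i for i in range(attempts)]

def wasted_time(initial, attempts):
    """Upper bound on time spent in aborted tries (all but the last one)."""
    return sum(timeout_schedule(initial, attempts)[:-1])

schedule = timeout_schedule(3600, 4)
print(schedule)              # [3600, 7200, 14400, 28800]
print(wasted_time(3600, 4))  # 25200, which is less than the final timeout of 28800
```

The sum of the earlier timeouts is a geometric series, which is why it always stays below the last timeout, however many retries are needed.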

NB: My example uses the stopit library: https://github.com/glenfant/stopit


 