
Why can't a Keras model run in a multiprocessing pool in Python?

I want to predict values using a trained model. I use this code:

from keras.models import load_model
import multiprocessing
model = load_model('CNN_MODEL.hdf')
def Net(x):
    print("test")
    return model.predict(x)

X_test=.....(a list )
pool=multiprocessing.Pool(processes=12)
Y=pool.map(Net,X_test)

pool.close()
pool.join()

But it's too slow. The output is

test test test ... test

with 12 "test"s, and then it gets stuck. My CPU has 36 cores. How can I solve this?

Searching around, I've discovered this potentially related answer suggesting that Keras can only be used in one process: using multiprocessing with theano.

Is there a way to accomplish my goal? A high level description or short example is greatly appreciated.

Note: I've attempted approaches along the lines of passing a graph to the process, but failed, since it seems TensorFlow graphs aren't picklable (related SO post for that here: Tensorflow: Passing a session to a python multiprocess). If there is indeed a way to pass the TensorFlow graph/model to the child process, then I am open to that as well.

From my experience, the problem lies in loading Keras in one process and then spawning a new process after Keras has already been loaded into your main environment. But for some applications (e.g. training a mixture of Keras models) it's simply better to have all of this in one process. So what I advise is the following (a little cumbersome, but working for me) approach:

DO NOT LOAD KERAS INTO YOUR MAIN ENVIRONMENT. If you want to load Keras / Theano / TensorFlow, do it only in the function environment. E.g. don't do this:

import keras

def training_function(...):
    ...

but do the following:

def training_function(...):
    import keras
    ...
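Applied to the original question, this rule can be combined with a Pool initializer so that each worker process imports Keras and loads its own copy of the model exactly once, instead of the parent loading it before forking. This is a minimal sketch; the function names and the pluggable `load_fn` are illustrative, not from the post (in the real version `load_fn` would be `keras.models.load_model`):

```python
import multiprocessing

_model = None  # one copy per worker process, set by the initializer


def _init_worker(load_fn, model_path):
    # Runs once in each worker; heavy framework imports and model
    # loading happen here, never in the parent process.
    global _model
    _model = load_fn(model_path)


def _predict(x):
    # Uses the worker-local model loaded by _init_worker.
    return _model.predict(x)


def parallel_predict(load_fn, model_path, inputs, workers=4):
    # Each of the `workers` processes calls _init_worker exactly once,
    # so the model is loaded `workers` times total, not once per input.
    with multiprocessing.Pool(processes=workers,
                              initializer=_init_worker,
                              initargs=(load_fn, model_path)) as pool:
        return pool.map(_predict, inputs)
```

The key difference from the question's code is that the parent process never touches Keras at all, so the workers don't inherit a half-initialized framework state.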

Run work connected with each model in a separate process: I usually create workers that do the job (e.g. training, tuning, scoring) and run them in separate processes. What is nice about this is that all the memory used by such a process is completely freed when the process finishes. This helps with the many memory problems you usually come across when using multiprocessing, or even when running multiple models in one process. So it looks e.g. like this:

import multiprocessing

def _training_worker(train_params):
    import keras  # heavy import stays inside the worker process
    model = obtain_model(train_params)
    model.fit(train_params)
    send_message_to_main_process(...)

def train_new_model(train_params):
    # args must be a tuple, even for a single argument
    training_process = multiprocessing.Process(target=_training_worker,
                                               args=(train_params,))
    training_process.start()
    get_message_from_training_process(...)
    training_process.join()
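The `send_message_to_main_process` / `get_message_from_training_process` helpers above are left undefined; one simple way to realize them is a `multiprocessing.Queue`. A minimal sketch, with the training body replaced by a stand-in result:

```python
import multiprocessing


def _training_worker(train_params, queue):
    # Heavy framework imports would stay inside the worker:
    # import keras
    # model = obtain_model(train_params); model.fit(...)
    result = {"params": train_params, "status": "done"}  # stand-in for training output
    queue.put(result)  # plays the role of send_message_to_main_process


def train_new_model(train_params):
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=_training_worker,
                                args=(train_params, queue))
    p.start()
    message = queue.get()  # blocks until the worker reports back
    p.join()               # worker exits; all its memory is released here
    return message
```

Reading from the queue before `join()` avoids a deadlock when the worker sends large results, since `Queue.put` can block until the data is consumed.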

A different approach is simply preparing separate scripts for different model actions. But this may cause memory errors, especially when your models are memory-consuming. NOTE that for this reason it's better to make your execution strictly sequential.
