
Why can't a Keras model run in a multiprocessing pool in Python?

I want to predict values using a trained model. I use this code:

from keras.models import load_model
import multiprocessing

model = load_model('CNN_MODEL.hdf')

def Net(x):
    print("test")
    return model.predict(x)

X_test = ...  # (a list)
pool = multiprocessing.Pool(processes=12)
Y = pool.map(Net, X_test)

pool.close()
pool.join()

But it's too slow. The output is:

test test test ... ("test" is printed 12 times), and then it gets stuck. My CPU has 36 cores. How can I solve this?

Searching around, I've discovered this potentially related answer suggesting that Keras can only be utilized in one process: using multiprocessing with theano.

Is there a way to accomplish my goal? A high-level description or a short example is greatly appreciated.

Note: I've attempted approaches along the lines of passing a graph to the process, but failed since it seems tensorflow graphs aren't picklable (related SO post for that here: Tensorflow: Passing a session to a python multiprocess). If there is indeed a way to pass the tensorflow graph/model to the child process then I am open to that as well.

From my experience, the problem lies in loading Keras into one process and then spawning a new process after Keras has been loaded into your main environment. But for some applications (like e.g. training a mixture of Keras models) it's simply better to have all of these things in one process. So what I advise is the following (a bit cumbersome, but working for me) approach:

DO NOT LOAD KERAS INTO YOUR MAIN ENVIRONMENT. If you want to load Keras / Theano / TensorFlow, do it only in the function environment. E.g. don't do this:

import keras

def training_function(...):
    ...

but do the following:

def training_function(...):
    import keras
    ...

Run the work connected with each model in a separate process: I usually create workers which do the job (e.g. training, tuning, scoring) and run them in separate processes. What is nice about this is that the whole memory used by the process is completely freed when your process is done. This helps with the loads of memory problems you usually come across when using multiprocessing, or even when running multiple models in one process. So this looks e.g. like this:

def _training_worker(train_params):
    import keras  # keras is imported only inside the worker process
    model = obtain_model(train_params)
    model.fit(train_params)
    send_message_to_main_process(...)

def train_new_model(train_params):
    # note: args must be a tuple of arguments, hence the trailing comma
    training_process = multiprocessing.Process(target=_training_worker,
                                               args=(train_params,))
    training_process.start()
    get_message_from_training_process(...)
    training_process.join()
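
Applied to the question's prediction case, the same load-inside-the-worker idea looks roughly like the sketch below. This is not part of my original code: it assumes the asker's 'CNN_MODEL.hdf' file and uses a Pool initializer so each of the 12 workers imports Keras and loads the model exactly once, instead of loading it in the main environment:

model = None  # set per worker by the initializer

def _init_worker():
    # runs once in each worker process; Keras is imported only here
    global model
    from keras.models import load_model
    model = load_model('CNN_MODEL.hdf')

def _predict(x):
    return model.predict(x)

if __name__ == '__main__':
    import multiprocessing
    X_test = ...  # (a list, as in the question)
    with multiprocessing.Pool(processes=12, initializer=_init_worker) as pool:
        Y = pool.map(_predict, X_test)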

A different approach is simply preparing different scripts for different model actions. But this may cause memory errors, especially when your models are memory-consuming. NOTE that for this reason it's better to make your execution strictly sequential, e.g. with a small driver like the one below.
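
A minimal driver for that variant might look like this (a sketch; train.py and evaluate.py are hypothetical per-action scripts):

import subprocess

# run each model action in its own interpreter, strictly one after another,
# so each script's memory is fully released before the next one starts
for script in ['train.py', 'evaluate.py']:
    subprocess.run(['python', script], check=True)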
