
Keras multiprocessing model prediction

I have a simple Keras model trained on MNIST that I use to make predictions and record the loss. I am running on a server with multiple CPUs, so I want to use multiprocessing to speed this up.

I have used multiprocessing successfully with some basic functions, but with model prediction the processes never finish. The same code without multiprocessing works fine.

I suspected that a single model cannot be shared across parallel processes, so I loaded the model inside each process, but that did not work either.

My code is this:

import multiprocessing
from multiprocessing import Process

import numpy as np
import tensorflow as tf

# x_train and y_train (one-hot labels) are assumed to be loaded already

# make a prediction on a training sample
def predict(idx, return_dict):
    x = tf.convert_to_tensor(np.expand_dims(x_train[idx], axis=0))

    local_model = tf.keras.models.load_model('model.h5')
    y = local_model(x)
    print('this never gets printed')
    y_expanded = np.expand_dims(y_train[idx], axis=0)
    # instantiate the loss object first, then call it on (y_true, y_pred)
    loss = tf.keras.losses.CategoricalCrossentropy()(y_expanded, y)
    return_dict[idx] = loss

manager = multiprocessing.Manager()
return_dict = manager.dict()
jobs = []

# start one process per training sample
for i in range(10):
    p = Process(target=predict, args=(i, return_dict))
    jobs.append(p)
    p.start()

# wait for all processes to finish
for proc in jobs:
    proc.join()

print(return_dict.values())

The print line in the predict function is never reached, so the problem seems to be with the model. Even when I use a global model instead of loading it inside the function, the problem persists.

I followed this thread, but it did not work. My questions are:

  1. How can I solve the model issue?
  2. Can I use the same x_train for all the processes?

Thanks.

I found the answer. First of all, Keras has known issues with multiprocessing (1, 2). Moreover, each process should have its own single TensorFlow session, so TensorFlow must be imported only inside the worker function and nowhere else. The model should also be loaded from disk inside each function call. This leaves room for improvement, for example keeping the model in RAM, or serializing it and passing it to the function instead of reloading it from disk every time.

Nevertheless, the code below works:

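# x_train, y_train and np (NumPy) are assumed to be defined at module level
def predict(idx, return_dict):
    # import TensorFlow only inside the worker process
    import tensorflow as tf

    x = tf.convert_to_tensor(x_train[idx])
    cce = tf.keras.losses.CategoricalCrossentropy()

    # load a separate copy of the model in each process
    local = tf.keras.models.load_model('model.h5')

    y = local(np.expand_dims(x, axis=0))
    y_expanded = np.expand_dims(y_train[idx], axis=0)
    loss = cce(y_expanded, y)

    return_dict[idx] = loss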
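
For completeness, here is a minimal sketch of a driver that launches such a worker. It is not the original poster's exact script: the names `predict_one` and `run_jobs` are hypothetical, the samples are passed as arguments instead of being read from globals, and the "spawn" start method is an added assumption. With "spawn" each child starts from a fresh interpreter, so it should not inherit any TensorFlow state from the parent.

```python
import multiprocessing as mp


def predict_one(idx, x_sample, y_sample, model_path, return_dict):
    """Worker: load the model, score one sample, store its loss under idx."""
    # Imported here so that only the child process initializes TensorFlow.
    import numpy as np
    import tensorflow as tf

    model = tf.keras.models.load_model(model_path)
    y_pred = model(np.expand_dims(x_sample, axis=0), training=False)
    cce = tf.keras.losses.CategoricalCrossentropy()
    loss = cce(np.expand_dims(y_sample, axis=0), y_pred)
    return_dict[idx] = float(loss)  # a plain float pickles cleanly


def run_jobs(target, arg_list, start_method="spawn"):
    """Run target(*args, return_dict) in one process per entry of arg_list."""
    ctx = mp.get_context(start_method)
    with ctx.Manager() as manager:
        return_dict = manager.dict()
        jobs = [ctx.Process(target=target, args=(*args, return_dict))
                for args in arg_list]
        for proc in jobs:
            proc.start()
        for proc in jobs:
            proc.join()
        return return_dict.copy()  # copy out before the manager shuts down


# Usage, guarded so that spawned children do not re-run it on import
# (x_train and y_train are assumed to be preprocessed NumPy arrays):
# if __name__ == "__main__":
#     args = [(i, x_train[i], y_train[i], "model.h5") for i in range(10)]
#     print(run_jobs(predict_one, args))
```

Under "spawn" the worker function has to live at module top level, and the launching code needs the `if __name__ == "__main__":` guard, because each child re-imports the main module.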

The same x_train can be used by all the processes.
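
One way to act on the possible improvement mentioned in the answer is to load the model once per worker instead of once per sample, with every worker reading from the same x_train. The sketch below is only an illustration under stated assumptions: `split_indices`, `predict_chunk` and `run_chunked` are hypothetical names, x_train and y_train are assumed to be NumPy arrays already preprocessed the way the model expects (with one-hot labels), and the "spawn" start method is again an assumption.

```python
import multiprocessing as mp


def split_indices(n_samples, n_workers):
    """Split range(n_samples) into n_workers contiguous, near-equal chunks."""
    base, extra = divmod(n_samples, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


def predict_chunk(indices, x_train, y_train, model_path, return_dict):
    """Worker: load the model once, then score every sample in indices."""
    # TensorFlow is imported inside the worker, as the answer recommends.
    import tensorflow as tf

    model = tf.keras.models.load_model(model_path)  # once per worker
    cce = tf.keras.losses.CategoricalCrossentropy()
    for idx in indices:
        # Slicing with idx:idx + 1 keeps the batch dimension.
        y_pred = model(x_train[idx:idx + 1], training=False)
        return_dict[idx] = float(cce(y_train[idx:idx + 1], y_pred))


def run_chunked(x_train, y_train, model_path="model.h5", n_workers=4):
    """Score all samples with n_workers processes; returns {idx: loss}."""
    ctx = mp.get_context("spawn")  # fresh interpreter for each worker
    with ctx.Manager() as manager:
        return_dict = manager.dict()
        jobs = [
            ctx.Process(target=predict_chunk,
                        args=(chunk, x_train, y_train, model_path, return_dict))
            for chunk in split_indices(len(x_train), n_workers)
            if chunk  # skip empty chunks when there are more workers than samples
        ]
        for proc in jobs:
            proc.start()
        for proc in jobs:
            proc.join()
        return return_dict.copy()
```

Note that with "spawn" the arrays are pickled and copied into each worker, so for a large x_train it may be cheaper to pass only the slice each worker needs.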


 