How do I run multiple Keras programs on a single GPU?

Question

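I'm working on a Python project where I need to build multiple Keras models for each dataset. When I run the code that builds a Keras model, the program uses only 10% of my GPU (GTX 1050 Ti).

My question is: can I use 100% of my GPU to reduce the time? Or is it possible to run multiple programs on the same GPU?

I tried running multiple programs on a single GPU, but they did not run in parallel. For example, when I run a single Python program, each epoch takes 5 seconds, but when I run 2 programs at once, each epoch takes 10 seconds. What is the best way to run multiple programs?

Thanks in advance!!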

Answer 1

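I'm not sure there is a proper way of doing this, but this "gambiarra" (a makeshift workaround) seems to work quite well.

Make one model that joins two or more models together in parallel. The only drawback is that you need the same number of input samples for each model when training and predicting them in parallel.

How to use two models in parallel with a functional API model: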

input1 = Input(inputShapeOfModel1)
input2 = Input(inputShapeOfModel2)

output1 = model1(input1)
output2 = model2(input2) #it could be model1 again, using model1 twice in parallel. 

parallelModel = Model([input1,input2], [output1,output2])

You train and predict with this model, passing the parallel input and output data:

parallelModel.fit([x_train1, x_train2], [y_train1, y_train2], ...)

Working test code:

from keras.layers import *
from keras.models import Model, Sequential
import numpy as np

#simulating two "existing" models
model1 = Sequential()
model2 = Sequential()

#creating "existing" model 1
model1.add(Conv2D(10,3,activation='tanh', input_shape=(20,20,3)))
model1.add(Flatten())
model1.add(Dense(1,activation='sigmoid'))

#creating "existing" model 2
model2.add(Dense(20, input_shape=(2,)))
model2.add(Dense(3))


#part containing the proposed answer: joining the two models in parallel
inp1 = Input((20,20,3))
inp2 = Input((2,))

out1 = model1(inp1)
out2 = model2(inp2)

model = Model([inp1,inp2],[out1,out2])


#treat the new model as any other model
model.compile(optimizer='adam', loss='mse')

#dummy input data x and y, for models 1 and 2
x1 = np.ones((30,20,20,3))
y1 = np.ones((30,1))
x2 = np.ones((30,2))
y2 = np.ones((30,3))

#training the model and predicting
model.fit([x1,x2],[y1,y2], epochs = 50)
ypred1,ypred2 = model.predict([x1,x2])

print(ypred1.shape)
print(ypred2.shape)

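Advanced solution - grouping data to improve speed and match the number of samples

There is still room for optimization, because this approach synchronizes batches between the two models. So, if one model is much faster than the other, the fast model is held to the speed of the slow one.

Also, if the models have a different number of batches, you will need to train/predict the remaining data separately.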
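
As a minimal sketch of that split, the first `min(len(x1), len(x2))` samples can go to the joined model and whatever is left over can be handled by the individual model afterwards. The helper name `split_synchronized` and the sample sizes are hypothetical, not part of Keras or of the original answer.

```python
def split_synchronized(x1, x2):
    """Split two sample sequences into a same-length part and a remainder.

    Works on anything that supports len() and slicing (lists, NumPy arrays).
    """
    n = min(len(x1), len(x2))
    synced = (x1[:n], x2[:n])      # same length: feed these to the joined model
    remainder = (x1[n:], x2[n:])   # at most one of these is non-empty
    return synced, remainder


# Hypothetical sizes: 5 samples for model 1, 8 samples for model 2
x1 = list(range(5))
x2 = list(range(100, 108))
(sync1, sync2), (rest1, rest2) = split_synchronized(x1, x2)
print(len(sync1), len(sync2), len(rest1), len(rest2))  # 5 5 0 3
```

The targets would need the same split so that every sample stays aligned with its label.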

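If you group your input data and use some custom reshapes in the model with Lambda layers, you can work around these limitations too: reshape the batch dimension at the beginning and restore it at the end.

For example, if x1 has 300 samples and x2 has 600 samples, you can reshape the input and output: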

x2 = x2.reshape((300,2,....))
y2 = y2.reshape((300,2,....))

Before and after model2, you use:

#before
Lambda(lambda x: K.reshape(x,(-1,....))) #reshapes into the inner model's input shape

#after
Lambda(lambda x: K.reshape(x, (-1,2,....))) #reshapes back into the grouped shape for the output

Here .... stands for the original input and output shapes (not counting batch_size).
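
To make the reshape concrete, here is a small NumPy-only sketch. It assumes x2 has 600 samples with inner shape (2,) and is grouped in pairs so that its first dimension matches the 300 samples of x1. The sizes and the `group` variable are illustrative assumptions; the second reshape mirrors what the "before" Lambda layer does to the batch inside the model.

```python
import numpy as np

group = 2                                   # samples of x2 packed per sample of x1
x2 = np.arange(600 * 2).reshape((600, 2))   # 600 samples, inner shape (2,)

# Group consecutive samples: (600, 2) -> (300, 2, 2)
x2_grouped = x2.reshape((-1, group) + x2.shape[1:])
print(x2_grouped.shape)                     # (300, 2, 2)

# Undo the grouping, as the "before" Lambda does inside the model
x2_restored = x2_grouped.reshape((-1,) + x2.shape[1:])
print(x2_restored.shape)                    # (600, 2)

# The round trip is lossless and keeps the sample order
assert np.array_equal(x2, x2_restored)
# Group i holds original samples 2*i and 2*i + 1
assert np.array_equal(x2_grouped[1], x2[2:4])
```

The number of samples has to be divisible by `group`; otherwise `reshape` raises a `ValueError`.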

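Then you just need to decide which is best: grouping data to synchronize the data size, or grouping data to synchronize the speed.

(Advantage compared to the next solution: you can easily group by any number, such as 2, 5, 10, 200...)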

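Advanced solution - using the same model several times in parallel to double the speed

You can also use the same model twice in parallel, as in this code. This will probably double its speed.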

from keras.layers import *
from keras.models import Model, Sequential
#import keras.backend as K
import numpy as np
#import tensorflow as tf


#simulating two "existing" models
model1 = Sequential()
model2 = Sequential()

#model 1
model1.add(Conv2D(10,3,activation='tanh', input_shape=(20,20,3)))
model1.add(Flatten())
model1.add(Dense(1,activation='sigmoid'))

#model 2
model2.add(Dense(20, input_shape=(2,)))
model2.add(Dense(3))

#joining the models
inp1 = Input((20,20,3))

#two inputs for model 2 (the model we want to run twice as fast)
inp2 = Input((2,))
inp3 = Input((2,))

out1 = model1(inp1)
out2 = model2(inp2) #use model 2 once
out3 = model2(inp3) #use model 2 twice

model = Model([inp1,inp2,inp3],[out1,out2,out3])

model.compile(optimizer='adam', loss='mse')

#dummy data - remember to have two inputs for model 2, not repeated
x1 = np.ones((30,20,20,3))
y1 = np.ones((30,1))
x2 = np.ones((30,2)) #first input for model 2
y2 = np.ones((30,3)) #first output for model 2
x3 = np.zeros((30,2)) #second input for model 2
y3 = np.zeros((30,3)) #second output for model 2

model.fit([x1,x2,x3],[y1,y2,y3], epochs = 50)
ypred1,ypred2,ypred3 = model.predict([x1,x2,x3])

print(ypred1.shape)
print(ypred2.shape)
print(ypred3.shape)

Advantage compared to the previous solution: less trouble manipulating data and writing custom reshapes.
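
In practice, the two inputs of model 2 would typically be two different parts of the same dataset rather than dummy arrays. A minimal sketch, assuming a hypothetical helper `split_in_halves` (not a Keras function), that produces the data for inp2 and inp3 and sets aside an odd leftover sample:

```python
def split_in_halves(x):
    """Return (first_half, second_half, leftover) with equally long halves."""
    half = len(x) // 2
    return x[:half], x[half:2 * half], x[2 * half:]


# Hypothetical dataset of 7 samples for model 2
samples = list(range(7))
first, second, leftover = split_in_halves(samples)
print(first, second, leftover)   # [0, 1, 2] [3, 4, 5] [6]
```

The same split applies to the targets. Each training step then processes one batch from each half, so an epoch over model 2's data needs roughly half as many steps; whether wall-clock time also halves depends on how much spare GPU capacity there is.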

Solution 1
7 Accepted 2018-05-22 18:54:02
