
How much GPU utilization should I expect when training a Keras model?

I am using an RTX 2080 Ti on the MNIST dataset, and I have installed tensorflow-gpu. It is almost 12 times faster than running on the CPU alone in the other environment.

I am checking CPU and GPU performance in Task Manager while training. Here is the performance during training:

GPU environment: CPU = 20%, GPU = 10%, training time = 24 sec

CPU environment: CPU = 100%, GPU = 10%, training time = 500 sec

I would like to know whether the GPU running at 10% is normal, and whether I can increase or decrease that utilization manually.

It depends on your application. It is not unusual to have low GPU utilization. Try increasing the batch size for higher utilization.

That being said, MNIST-sized networks are tiny, and it's hard to achieve high GPU (or CPU) efficiency for them, so I think 10% GPU utilization alongside moderate CPU usage is not unusual for your application. You will get higher computational efficiency with a larger batch size, meaning you can process more examples per second, but you will also get lower statistical efficiency, meaning you need to process more examples in total to reach the target accuracy. So it's a trade-off. For tiny models like this, the statistical efficiency drops off very quickly after a batch size of about 100, so it's probably not worth trying to grow the batch size for training. For inference, you should use the largest batch size you can.
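As a rough illustration of the throughput side of that trade-off, here is a minimal sketch that times one MNIST epoch at several batch sizes. The two-layer architecture and the batch sizes are arbitrary choices for this illustration, not taken from the question; on a GPU the seconds-per-epoch figure should drop noticeably as the batch size grows:

import time
import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1]
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0

for batch_size in (32, 128, 512, 2048):
    # A small, arbitrary MLP -- just enough to measure throughput
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    start = time.time()
    model.fit(x_train, y_train, epochs=1, batch_size=batch_size, verbose=0)
    print(f"batch_size={batch_size}: {time.time() - start:.1f} s/epoch")

Watch Task Manager (or nvidia-smi) while this runs: GPU utilization should climb with the batch size, which is exactly the computational-efficiency side of the trade-off described above.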

You can also set the device to be used in your program. In your case, force your program to use just the GPU and then check the GPU utilization.
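Before forcing a device, it is worth confirming that TensorFlow can see the GPU at all; this uses the standard TF 2.x API:

import tensorflow as tf

# Lists the GPUs TensorFlow has detected; an empty list means
# training will silently fall back to the CPU.
print(tf.config.list_physical_devices('GPU'))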

For example, the program below uses the GPU only for model.fit:

%tensorflow_version 2.x
# MLP for the Pima Indians diabetes dataset
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

print(tf.__version__)

# load pima indians dataset
dataset = np.loadtxt("/content/pima-indians-diabetes.csv", delimiter=",")

# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

# define model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Model Summary
model.summary()

# Fit the model, explicitly placing the training step on the first GPU
with tf.device("/device:GPU:0"):
  model.fit(X, Y, epochs=150, batch_size=10, verbose=0)

# evaluate the model
scores = model.evaluate(X, Y, verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

Output -

2.2.0
Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_18 (Dense)             (None, 12)                108       
_________________________________________________________________
dense_19 (Dense)             (None, 8)                 104       
_________________________________________________________________
dense_20 (Dense)             (None, 1)                 9         
=================================================================
Total params: 221
Trainable params: 221
Non-trainable params: 0
_________________________________________________________________
accuracy: 78.39%
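If you want to confirm which device each operation actually ran on, TF 2.x can also log device placement. This is a standard debugging aid, but it is verbose, so enable it before building the model and turn it off for normal runs:

import tensorflow as tf

# Print the device (CPU or GPU) every op is placed on.
# Must be called before any ops are created.
tf.debugging.set_log_device_placement(True)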

Hope this answers your question. Happy Learning.
