
Colab isn't using GPU for training

I'm training a custom model on the GTSRB dataset for traffic sign recognition on Google Colab. I've built the model successfully, but training runs on the CPU instead of the GPU. I selected the GPU runtime beforehand and am using a mirrored strategy with Keras/TensorFlow. Any help will be appreciated. Snippets of my code follow.

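# Assumed setup; the imports and strategy were not shown in the original snippets.
import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator

strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
  # Input pipelines: rescale pixels to [0, 1]; augment the training set only.
  test_datagen = ImageDataGenerator(rescale=1./255)
  train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2,
                                     horizontal_flip=False)
  train_generator = train_datagen.flow_from_directory(
      '/content/drive/MyDrive/Train', target_size=(64, 64), batch_size=32,
      class_mode='categorical')
  validation_generator = test_datagen.flow_from_directory(
      directory='/content/drive/MyDrive/validation', target_size=(64, 64), batch_size=32,
      class_mode='categorical')
  # Save the model weights at the end of every epoch.
  checkpoint_path = "training_1/cp.ckpt"
  cp_callback = ModelCheckpoint(filepath=checkpoint_path, save_weights_only=True, verbose=1,
                                save_freq='epoch')

with strategy.scope():
  # training(classes) is my own model-building function (not shown here).
  model = training(classes)
  model.compile(loss='categorical_crossentropy', optimizer=Adam(learning_rate=1e-4),
                metrics=['accuracy'])

model.fit(train_generator, steps_per_epoch=200, epochs=50, validation_data=validation_generator,
          validation_steps=100, callbacks=[cp_callback])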

While training, I checked the GPU and CPU usage, which is shown in the image:

[Image: GPU and CPU usage during training]

Check the status of the GPU while the model is running, using the nvidia-smi command.
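In a Colab cell, the command can be run directly as !nvidia-smi. To poll it from Python while training runs, something like the sketch below might help. It assumes nvidia-smi is on the PATH and uses its CSV query mode; the helper names (parse_gpu_stats, read_gpu_stats) are made up for illustration and are not part of any library.

```python
import shutil
import subprocess

# Ask nvidia-smi for utilization and memory as bare CSV numbers, one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def _to_int(field):
    """Convert a CSV field to int; return None for values such as '[N/A]'."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_stats(csv_text):
    """Parse the CSV output of QUERY into one dict per GPU (hypothetical helper)."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = line.split(",")
        stats.append({
            "utilization_pct": _to_int(util),
            "memory_used_mib": _to_int(used),
            "memory_total_mib": _to_int(total),
        })
    return stats


def read_gpu_stats():
    """Run nvidia-smi and return parsed stats; empty list if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return parse_gpu_stats(result.stdout)


if __name__ == "__main__":
    for index, gpu in enumerate(read_gpu_stats()):
        print(f"GPU {index}: {gpu}")
```

If utilization stays near 0% and memory use barely changes during model.fit, the training is most likely not running on the GPU. An empty result means nvidia-smi was not found or could not reach a GPU.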

I have run into this several times on multiple platforms (Kaggle, Colab) but only managed to solve it once. In that case, the cause was that tensorflow_gpu (the GPU version of TensorFlow) was not installed after I switched TensorFlow versions. Check the installed versions of tensorflow_gpu and tensorflow, and make sure they are the same. This may be totally off the mark, or it might work for you.
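One way to do that version check from a notebook cell is sketched below. It assumes the pip distribution names are tensorflow and tensorflow-gpu; the helper names are made up for illustration. It also asks TensorFlow which GPUs it can see via tf.config.list_physical_devices, guarded so the snippet still runs when TensorFlow is not installed.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def visible_gpus():
    """Return names of GPUs TensorFlow can see, or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    # Distribution names are an assumption; adjust to whatever pip reports.
    print(installed_versions(["tensorflow", "tensorflow-gpu"]))
    print("GPUs visible to TensorFlow:", visible_gpus())
```

An empty GPU list here, with the GPU runtime selected, would suggest a problem with the installed TensorFlow build and not with the training code.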
