Training on GPU much slower than on CPU - why and how to speed it up?

I am training a Convolutional Neural Network using Google Colab's CPU and GPU.

This is the architecture of the network:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 62, 126, 32)       896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 31, 63, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 29, 61, 32)        9248      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 30, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 12, 28, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 6, 14, 64)         0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 12, 64)         36928     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 6, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 768)               0         
_________________________________________________________________
dropout (Dropout)            (None, 768)               0         
_________________________________________________________________
lambda (Lambda)              (None, 1, 768)            0         
_________________________________________________________________
dense (Dense)                (None, 1, 256)            196864    
_________________________________________________________________
dense_1 (Dense)              (None, 1, 8)              2056      
_________________________________________________________________
permute (Permute)            (None, 8, 1)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 8, 36)             72        
=================================================================
Total params: 264,560
Trainable params: 264,560
Non-trainable params: 0

So, this is a very small network, but with a specific output shape of (8, 36), because I want to recognize the characters on images of license plates (8 character positions, 36 classes each).

I used this code to train the network:

model.fit_generator(generator=training_generator,
                    validation_data=validation_generator,
                    steps_per_epoch = num_train_samples // 128,
                    validation_steps = num_val_samples // 128,
                    epochs = 10)
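
As an aside, fit_generator is deprecated in newer TensorFlow releases; model.fit accepts a Sequence directly with the same arguments, so an equivalent call would look like this:

# Equivalent call for newer TF versions, where fit_generator is deprecated:
model.fit(training_generator,
          validation_data=validation_generator,
          steps_per_epoch=num_train_samples // 128,
          validation_steps=num_val_samples // 128,
          epochs=10)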

The generator resizes the images to (64, 128). This is the code for the generator:

# Imports assumed from the usual skimage / Keras setup:
import math

import numpy as np
from skimage.io import imread
from skimage.transform import resize
from tensorflow.keras.utils import Sequence


class DataGenerator(Sequence):

    def __init__(self, x_set, y_set, batch_size):
        # x_set holds the image file names, y_set the matching labels
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        # Slice one batch of file names and labels
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]

        # Read every image of the batch from disk and resize it
        return np.array([
            resize(imread(file_name), (64, 128))
            for file_name in batch_x]), np.array(batch_y)
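
If the generator is the bottleneck (every __getitem__ reads and resizes each image from disk), Keras can run a Sequence in several parallel worker processes. A minimal sketch, assuming a Keras version whose fit_generator still takes the standard workers / use_multiprocessing / max_queue_size parameters:

model.fit_generator(generator=training_generator,
                    validation_data=validation_generator,
                    steps_per_epoch=num_train_samples // 128,
                    validation_steps=num_val_samples // 128,
                    epochs=10,
                    max_queue_size=10,         # batches buffered ahead of training
                    workers=4,                 # parallel processes filling the queue
                    use_multiprocessing=True)  # safe with a Sequence, which is indexable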

On the CPU, one epoch takes 70-90 minutes. On the GPU (149 W), it takes five times as long as on the CPU.

  1. Do you know why it takes so long? Is there something wrong with the generator?
  2. Can I speed this process up somehow?

Edit: This is the link to my notebook: https://colab.research.google.com/drive/1ux9E8DhxPxtgaV60WUiYI2ew2s74Xrwh?usp=sharing

My data is stored in my Google Drive. The training data set contains 105k images and the validation data set 76k. All in all, I have 1.8 GB of data.

Should I maybe store the data somewhere else?
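
If reading many small files through the mounted Drive is the bottleneck, one common workaround is to copy the data to the Colab VM's local disk once per session and train from there. A minimal sketch, assuming the images are packed into a hypothetical plates.zip on Drive:

import shutil

# Hypothetical paths -- adjust to the real Drive layout. Unpacking one
# archive locally is much faster than reading ~180k individual files
# through the Drive mount on every epoch.
shutil.unpack_archive('/content/drive/MyDrive/plates.zip', '/content/plates')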

Thanks a lot!

I think you did not enable a GPU.

[screenshot of Colab's Notebook Settings dialog]

Go to Edit -> Notebook Settings, choose GPU, and then click SAVE.
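
To confirm that the runtime actually sees a GPU afterwards, a quick check with the standard TensorFlow API:

import tensorflow as tf

# Prints something like '/device:GPU:0' when a GPU is active,
# or an empty string when TensorFlow cannot see one.
print(tf.test.gpu_device_name())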
