
Are there any options or parameters that I can change to reduce the training time when training a convolutional neural network (DenseNet)?

I am using the following Python code with 'ImageAI' to use DenseNet for my research.

ImageAI GitHub: https://github.com/OlafenwaMoses/ImageAI

ImageAI example: https://towardsdatascience.com/train-image-recognition-ai-with-5-lines-of-code-8ed0bdd8d9ba

I am currently doing research on symbol recognition (2D building drawing symbols) using a CNN approach (DenseNet). One example of a VAV box symbol is: https://ibb.co/cyhwRvf

I am trying to classify 39 classes (the number of objects, which in this case is the number of symbols) and have 2,000 images per class for training data (2,000 x 39 = 78,000), and 1,000 images per class for test data (1,000 x 39 = 39,000). The total size of the dataset is 1.82 GB (I consider this relatively small, but please correct me if I am wrong).

The problem is that training takes a very long time.

I have tried using a GPU (Nvidia GeForce RTX 2080 Ti), and training takes 3 days when I set the number of epochs (num_experiments) to 200.

I would like to know whether there is a way to reduce the training time. Are there any parameters I can change, or any other options?

Or is this considered a normal training time given the size of the dataset and the GPU that I am using?

The five lines of code for training are the following:

from imageai.Prediction.Custom import ModelTraining

model_trainer = ModelTraining()
model_trainer.setModelTypeAsDenseNet()
model_trainer.setDataDirectory("mechsymbol")
model_trainer.trainModel(num_objects=39, num_experiments=200, enhance_data=True, batch_size=32, show_network_summary=True)

I'm fairly new to ImageAI/TensorFlow but still learning.

In terms of getting faster training through parameters, I think setting enhance_data to False will speed things up a bit. This parameter should be set to True only if you don't have a large set of images to train on (fewer than 1,000 per class), because it generates additional samples from the existing ones (data augmentation).

enhance_data = False
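
Applied to the training call from the question, that would mean (same arguments as in the question, just with augmentation turned off):

model_trainer.trainModel(num_objects=39, num_experiments=200, enhance_data=False, batch_size=32, show_network_summary=True)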

Also worth mentioning: you do not have to run the full 200-epoch training if you are not getting any better results after, for example, 50 epochs, because at that point you are likely overfitting your model.
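
As far as I know, ImageAI's trainModel does not expose an early-stopping option, but the idea in plain Keras looks like this (a sketch only; model, train_generator and val_generator are placeholders for your own model and data):

from tensorflow.keras.callbacks import EarlyStopping

# Stop once validation loss has not improved for 10 epochs in a row,
# and restore the weights from the best epoch seen so far.
early_stop = EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)

# model, train_generator and val_generator are placeholders for your own setup.
model.fit(train_generator, validation_data=val_generator, epochs=200, callbacks=[early_stop])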

Have you heard about "mixed precision training", using FP16 instead of FP32?

I have not tried it myself, since it would make only a marginal difference on my Nvidia GTX 1080 Ti, which does not have tensor cores (your RTX 2080 Ti does have them, so it should help in your case). But you can read more about it here: https://medium.com/tensorflow/automatic-mixed-precision-in-tensorflow-for-faster-ai-training-on-nvidia-gpus-6033234b2540
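
A minimal sketch of what that article describes, assuming TensorFlow 1.14+ on an NVIDIA GPU (the variable must be set before TensorFlow initializes); the TensorFlow 2.x equivalent is shown in the comment:

import os

# Enable NVIDIA's automatic mixed precision graph rewrite (TensorFlow 1.14+).
# This must be set before TensorFlow creates its session.
os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"] = "1"

# On TensorFlow 2.x, the Keras mixed-precision API is the equivalent:
# from tensorflow.keras import mixed_precision
# mixed_precision.set_global_policy("mixed_float16")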

For reference:

When I train ResNet on an image dataset with 62 classes (26 GB), with between 1,000 and 2,000 images per class for training and 200 for validation, each epoch takes about 10 minutes.

That would be about 33 hours for a 200-epoch training run (200 x 10 minutes ≈ 33 hours).

These are my ImageAI Python lines of code:

from imageai.Prediction.Custom import ModelTraining

model_trainer = ModelTraining()
model_trainer.setModelTypeAsResNet()
model_trainer.setDataDirectory('.')
# classes_in_train holds the number of classes (62 in my case)
model_trainer.trainModel(num_objects=classes_in_train, num_experiments=100, enhance_data=False, batch_size=32)

Found 81086 images belonging to 62 classes.
Found 13503 images belonging to 62 classes.

Epoch 38/100
231/2533 [=>............................] - ETA: 10:00 - loss: 2.9065e-04 - acc: 0.9999

Hope this gives you some helpful info.

