
Randomness of LSTM model

I have an LSTM model like the one below:

model = Sequential()
model.add(Conv1D(3, 32, input_shape=(60, 12)))
model.add(LSTM(units=256, return_sequences=False, dropout=0.25))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.summary() 

Each time I train it on the same dataset, I get a different model. Most of the time the trained model's performance is acceptable, but sometimes it is really bad. I think there is some randomness during initialization or training. So how can I fix everything so that each training run produces the same model?

I've experienced this problem with Keras as well; it has to do with the random seeds. You can fix the seeds like this before importing Keras, so that you get consistent results:

import os
os.environ['PYTHONHASHSEED'] = '0'

import random
random.seed(12345)

import numpy as np
np.random.seed(1000)

# Also fix the TensorFlow randomness if you need to:
import tensorflow as tf
tf.set_random_seed(1234)  # TF 1.x; in TF 2.x use tf.random.set_seed(1234)

This worked for me. 这对我有用。

Weights are initialized randomly in neural networks, so it is possible to get different results by design. If you think about how backpropagation works and how the cost function is minimized, you will notice that you don't have any guarantee that your network will find the global minimum. Fixing the seed is one way to get reproducible results, but on the other hand you limit your network to a fixed starting position, from which it may never reach the global minimum.
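The effect of seeding on weight initialization can be sketched without Keras at all. The snippet below uses a Glorot-style uniform initializer in plain NumPy (the 12 → 256 layer shape mirrors the model above, but the initializer and helper name are illustrative assumptions): unseeded calls give different weights every run, while a fixed seed reproduces them exactly.

```python
import numpy as np

def init_weights(seed=None):
    """Illustrative Glorot-style uniform init for a 12 -> 256 dense layer."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (12 + 256))  # Glorot uniform bound
    return rng.uniform(-limit, limit, size=(12, 256))

a = init_weights()         # unseeded: fresh entropy, differs run to run
b = init_weights()
c = init_weights(seed=42)  # seeded: identical on every call
d = init_weights(seed=42)
```

Here `a` and `b` differ while `c` and `d` are bit-for-bit identical, which is exactly why fixing the seeds above makes training reproducible.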

A lot of complex models, especially LSTMs, are unstable. You could look at convolutional approaches; I noticed they perform almost as well and are much more stable: https://arxiv.org/pdf/1803.01271.pdf
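A convolutional replacement for the LSTM above can be sketched in Keras like this. It keeps the same `(60, 12)` input shape and single regression output; the filter counts, kernel sizes, and dilation rates are illustrative assumptions in the spirit of the TCN paper, not values taken from it:

```python
# Minimal Conv1D-only sketch (causal, dilated convolutions) as a more
# stable alternative to the LSTM model; hyperparameters are illustrative.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv1D, GlobalAveragePooling1D,
                                     Dense, Dropout)

model = Sequential([
    Conv1D(64, 3, padding='causal', activation='relu',
           input_shape=(60, 12)),
    Conv1D(64, 3, padding='causal', dilation_rate=2, activation='relu'),
    GlobalAveragePooling1D(),   # collapse the time axis
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(1),                   # same single-value regression output
])
model.compile(optimizer='adam', loss='mse')
```

Causal padding keeps each output step from seeing future timesteps, and the dilated second layer widens the receptive field cheaply, which is the core idea behind the temporal convolutional networks in the linked paper.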

You can save it:

from keras.models import load_model
model.save("lstm_model.h5")

And load it later on:

model = load_model("lstm_model.h5")
