
Keras: specify input dropout layer that always keeps certain features

I'm training a neural net using Keras in Python for time-series climate data (predicting value X at time t=T), and tried adding a (20%) dropout layer on the inputs, which seemed to limit overfitting and cause a slight increase in performance. However, after I added a new and particularly useful feature (the value of the response variable at the time of prediction, t=0), I found massively increased performance by removing the dropout layer. This makes sense to me, since I can imagine how the neural net would "learn" the importance of that one feature and base the rest of its training around adjusting that value (i.e., "how do these other features affect how the response at t=0 changes by time t=T").

In addition, there are a few other features that I think should be present for all epochs. That said, I am still hopeful that a dropout layer could improve the model performance -- it just needs to not drop out certain features, like X at t_0: I need a dropout layer that will only drop out certain features.

I have searched for examples of doing this, and read the Keras documentation here, but can't seem to find a way to do it. I may be missing something obvious, as I'm still not familiar with how to manually edit layers. Any help would be appreciated. Thanks!

Edit: sorry for any lack of clarity. Here is the code where I define the model (p is the number of features):

from keras.models import Sequential
from keras.layers import Dense, Dropout

def create_model(p):
    model = Sequential()
    model.add(Dropout(0.2, input_shape=(p,)))  # % of input features dropped
    model.add(Dense(1000, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(30, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal', activation='linear'))
    model.compile(loss=cost_fn, optimizer='adam')  # cost_fn: custom loss defined elsewhere
    return model
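
For reference, a minimal sketch of how this model might be trained (the array names and the 'mse' stand-in for cost_fn are illustrative, not from the original post):

import numpy as np

cost_fn = 'mse'  # placeholder; the question uses a custom loss defined elsewhere

# Hypothetical training data: 500 samples with p=10 features each.
X_train = np.random.rand(500, 10).astype('float32')
y_train = np.random.rand(500, 1).astype('float32')

model = create_model(p=X_train.shape[1])
model.fit(X_train, y_train, epochs=10, batch_size=32)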

The best way I can think of to apply dropout only to specific features is to simply separate the features into different layers.

For that, I suggest you simply divide your inputs into essential features and droppable features:

from keras.layers import *
from keras.models import Model

def create_model(essentialP,droppableP):
    essentialInput = Input((essentialP,))
    droppableInput = Input((droppableP,))

    dropped = Dropout(0.2)(droppableInput)  # % of droppable features dropped
    completeInput = Concatenate()([essentialInput, dropped])  # keep essential features intact

    output = Dense(1000, kernel_initializer='normal', activation='sigmoid')(completeInput)
    output = Dense(30, kernel_initializer='normal', activation='relu')(output)
    output = Dense(1, kernel_initializer='normal',activation='linear')(output)

    model = Model([essentialInput,droppableInput],output)
    model.compile(loss=cost_fn, optimizer='adam')

    return model

Train the model using two inputs. You have to manage your inputs before training:

model.fit([essential_train_data,droppable_train_data], predictions, ...)
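
If your features currently sit in a single matrix, one way to build the two inputs is to slice it by column index before calling fit. This is a minimal sketch (the column indices, array names, and the 'mse' stand-in for cost_fn are placeholders; create_model is the two-input version from this answer):

import numpy as np

cost_fn = 'mse'  # stands in for the custom loss from the question

# Hypothetical data: 500 samples, 10 features, with columns 0 and 3 as must-keep features.
X = np.random.rand(500, 10).astype('float32')
predictions = np.random.rand(500, 1).astype('float32')

essential_cols = [0, 3]
droppable_cols = [i for i in range(X.shape[1]) if i not in essential_cols]

essential_train_data = X[:, essential_cols]
droppable_train_data = X[:, droppable_cols]

model = create_model(len(essential_cols), len(droppable_cols))
model.fit([essential_train_data, droppable_train_data], predictions, epochs=10, batch_size=32)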

This question already has an accepted answer, but it seems to me you are using dropout in a bad way.

Dropout is only for the hidden layers, not for the input layer!

Dropout acts as a regularizer and prevents complex co-adaptation in the hidden layers; quoting the Hinton paper: "Our work extends this idea by showing that dropout can be effectively applied in the hidden layers as well and that it can be interpreted as a form of model averaging" ( http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf )

Dropout can be seen as training several different models with your data and averaging the predictions at test time. If you prevent your models from having all the inputs during training, they will perform badly, especially if one input is crucial. What you actually want is to avoid overfitting, meaning you prevent overly complex models during the training phase (so each of your models will select the most important features first) before testing. It is common practice to drop some of the features in ensemble learning, but that is controlled and not stochastic like dropout. It also works for neural networks because hidden layers (often) have many more neurons than there are inputs, so dropout follows the law of large numbers; with a small number of inputs, you can in some bad cases have almost all of your inputs dropped.

In conclusion: it is a bad practice to use dropout in the input layer of a neural network.

I don't see any harm in using dropout in the input layer. The usage/effect would be a little different than normal, of course. The effect would be similar to adding synthetic noise to an input signal; only the feature/pixel/whatever would be entirely unknown [zeroed out] instead of noisy. And inserting synthetic noise into the input is one of the oldest ways to improve robustness; certainly not bad practice as long as you think about whether it makes sense for your data set.
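
To make the analogy concrete, here is a minimal sketch (not from the original answer) contrasting input dropout with the GaussianNoise layer that Keras provides for this kind of input corruption; both are only active at training time:

from keras.models import Sequential
from keras.layers import Dense, Dropout, GaussianNoise

p = 10  # hypothetical number of input features

# Input dropout: randomly zeroes 20% of the input features in each training batch.
dropout_model = Sequential([
    Dropout(0.2, input_shape=(p,)),
    Dense(30, activation='relu'),
    Dense(1, activation='linear'),
])
dropout_model.compile(loss='mse', optimizer='adam')

# Additive Gaussian noise: perturbs every input feature slightly instead of
# zeroing a random subset of them.
noise_model = Sequential([
    GaussianNoise(0.1, input_shape=(p,)),
    Dense(30, activation='relu'),
    Dense(1, activation='linear'),
])
noise_model.compile(loss='mse', optimizer='adam')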
