
Keras: specify input dropout layer that always keeps certain features

I'm training a neural net using Keras in Python on time-series climate data (predicting value X at time t=T), and I tried adding a 20% dropout layer on the inputs, which seemed to limit overfitting and cause a slight increase in performance. However, after I added a new and particularly useful feature (the value of the response variable at the time of prediction, t=0), I found massively increased performance by removing the dropout layer. This makes sense to me, since I can imagine how the neural net would "learn" the importance of that one feature and base the rest of its training around adjusting that value (i.e., "how do these other features affect how the response at t=0 changes by time t=T?").

In addition, there are a few other features that I think should be present in every epoch. That said, I am still hopeful that a dropout layer could improve model performance; it just needs to not drop certain features, like X at t=0. In short, I need a dropout layer that will only drop out certain features.

I have searched for examples of doing this and read the Keras documentation, but can't seem to find a way to do it. I may be missing something obvious, as I'm still not familiar with how to manually edit layers. Any help would be appreciated. Thanks!

Edit: sorry for any lack of clarity. Here is the code where I define the model (p is the number of features):

from keras.models import Sequential
from keras.layers import Dense, Dropout

def create_model(p):
    model = Sequential()
    model.add(Dropout(0.2, input_shape=(p,)))  # 20% of input features dropped
    model.add(Dense(1000, kernel_initializer='normal', activation='sigmoid'))
    model.add(Dense(30, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal', activation='linear'))
    model.compile(loss=cost_fn, optimizer='adam')  # cost_fn: custom loss defined elsewhere
    return model

The best way I can think of to apply dropout only to specific features is simply to separate the features into different layers.

For that, I suggest you simply divide your inputs into essential features and droppable features:

from keras.layers import *
from keras.models import Model

def create_model(essentialP,droppableP):
    essentialInput = Input((essentialP,))
    droppableInput = Input((droppableP,))

    dropped = Dropout(0.2)(droppableInput)  # 20% of the droppable features zeroed out
    completeInput = Concatenate()([essentialInput, dropped])

    output = Dense(1000, kernel_initializer='normal', activation='sigmoid')(completeInput)
    output = Dense(30, kernel_initializer='normal', activation='relu')(output)
    output = Dense(1, kernel_initializer='normal', activation='linear')(output)

    model = Model([essentialInput,droppableInput],output)
    model.compile(loss=cost_fn, optimizer='adam')

    return model

Train the model using the two inputs. You have to split your input data into the essential and droppable groups before training:

model.fit([essential_train_data,droppable_train_data], predictions, ...)
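
As a usage sketch (the column indices, X_train, and y_train here are hypothetical; adapt them to your own feature ordering), you could slice your feature matrix into the two groups before calling fit:

# Hypothetical layout: X_train is a numpy array whose first 3 columns are
# the must-keep features (e.g. X at t=0); the remaining columns may be dropped.
essential_cols = [0, 1, 2]
droppable_cols = list(range(3, X_train.shape[1]))

essential_train_data = X_train[:, essential_cols]
droppable_train_data = X_train[:, droppable_cols]

model = create_model(len(essential_cols), len(droppable_cols))
model.fit([essential_train_data, droppable_train_data], y_train, epochs=100)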

This question already has an accepted answer, but it seems to me you are using dropout in a bad way.

Dropout is only for the hidden layers, not for the input layer!

Dropout acts as a regularizer and prevents complex co-adaptation in the hidden layers. Quoting the Srivastava/Hinton paper: "Our work extends this idea by showing that dropout can be effectively applied in the hidden layers as well and that it can be interpreted as a form of model averaging" (http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf).

Dropout can be seen as training several different models on your data and averaging their predictions at test time. If you prevent your models from seeing all the inputs during training, they will perform badly, especially if one input is crucial. What you actually want is to avoid overfitting, i.e., to prevent overly complex models during the training phase (so that each of your models selects the most important features first) before testing. It is common practice to drop some of the features in ensemble learning, but that is controlled, not stochastic like dropout. Dropout also works for hidden layers because they (often) have far more neurons than there are inputs, so it benefits from the law of large numbers; with only a small number of inputs, an unlucky pass can drop almost all of them.
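
To make that law-of-large-numbers point concrete, here is a small sketch (assuming a 0.2 dropout rate and using scipy for the binomial tail) comparing a 5-feature input layer with a 1000-unit hidden layer:

from scipy.stats import binom

rate = 0.2  # dropout rate
for n in (5, 1000):  # few input features vs. a wide hidden layer
    # spread of the realised fraction of dropped units around the 20% target
    std = (rate * (1 - rate) / n) ** 0.5
    # chance that more than half of the units are dropped in a single pass
    p_half = 1 - binom.cdf(n // 2, n, rate)
    print(f"n={n}: std of dropped fraction = {std:.3f}, P(>50% dropped) = {p_half:.1e}")

With only 5 inputs, roughly 6% of training passes drop more than half of them; with 1000 hidden units, that essentially never happens.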

In conclusion: it is bad practice to use dropout in the input layer of a neural network.

I don't see any harm in using dropout on the input layer. The usage/effect would be a little different than normal, of course. The effect would be similar to adding synthetic noise to the input signal, except that a feature/pixel/whatever is entirely unknown (zeroed out) instead of merely noisy. And injecting synthetic noise into the input is one of the oldest ways to improve robustness; it is certainly not bad practice as long as you think about whether it makes sense for your data set.
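
If zeroing features out feels too aggressive, Keras also ships a GaussianNoise layer that adds noise to the inputs only at training time. A minimal sketch (the stddev of 0.1 and the layer sizes are placeholders, not tuned values):

from keras.models import Sequential
from keras.layers import Dense, GaussianNoise

model = Sequential()
model.add(GaussianNoise(0.1, input_shape=(p,)))  # additive zero-mean noise, active only during training
model.add(Dense(30, activation='relu'))
model.add(Dense(1, activation='linear'))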

