How to choose dimensionality of the Dense layer in LSTM?

I have a task of multi-label text classification. My dataset has 1369 classes:

# data shape
print(X_train.shape)
print(X_test.shape)
print(Y_train.shape)
print(Y_test.shape)
(54629, 500)
(23413, 500)
(54629, 1369)
(23413, 1369)

For this task, I've decided to use an LSTM network with the following parameters:

# define model
from keras.layers import Input, Embedding, LSTM, GlobalMaxPool1D, Dropout, Dense
from keras.models import Model

maxlen = 400  # must equal the padded sequence length, i.e. X_train.shape[1]
inp = Input(shape=(maxlen, ))
embed_size = 128
# max_features (the vocabulary size) is defined elsewhere
x = Embedding(max_features, embed_size)(inp)
x = LSTM(60, return_sequences=True, name='lstm_layer')(x)
x = GlobalMaxPool1D()(x)
x = Dropout(0.1)(x)
x = Dense(2000, activation="relu")(x)
x = Dropout(0.1)(x)
x = Dense(1369, activation="sigmoid")(x)  # one sigmoid unit per label
model = Model(inputs=inp, outputs=x)
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
batch_size = 32
epochs = 2
model.fit(X_train, Y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

Question: Are there any scientific methods for determining the dimensionality of the Dense and LSTM layers (in my example, LSTM dimension = 60, first Dense dimension = 2000, and second Dense dimension = 1369)?

If there are no scientific methods, maybe there are some heuristics or tips on how to do this for data of similar dimensions.

I chose these parameters at random. I would like to improve the accuracy of the model and learn the right way to approach similar problems.

I have heard that hyperparameter optimization is an NP-hard problem. Even if there is a better way to do it, it may not be worth the overhead cost for your project.

For the dimension of the LSTM layer, I have heard some numbers in conference talks that work well empirically, such as 128 or 256 units and 3 stacked layers. If you plot your loss during training and see that it decreases dramatically in the first several epochs but then stops decreasing, you may want to increase the capacity of your model, that is, make it either deeper or wider. Otherwise, keep the number of parameters as small as possible.
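The "loss stopped decreasing" check can also be done without a plot. The snippet below is only a rough sketch: the helper name `loss_has_plateaued`, the `patience` and `min_delta` thresholds, and the two loss lists are all made up for illustration. In practice the list would be the per-epoch losses, for example `history.history["loss"]` from the object returned by `model.fit`.

```python
def loss_has_plateaued(losses, patience=3, min_delta=1e-3):
    """Return True if each of the last `patience` epochs improved
    the loss by less than `min_delta` (hypothetical thresholds)."""
    if len(losses) <= patience:
        return False  # not enough epochs to judge
    recent = losses[-(patience + 1):]
    # Improvement per epoch: previous loss minus current loss.
    improvements = [prev - cur for prev, cur in zip(recent, recent[1:])]
    return all(imp < min_delta for imp in improvements)


# Made-up loss curves, one value per epoch.
steep_then_flat = [0.69, 0.21, 0.080, 0.0798, 0.0797, 0.0797]
still_improving = [0.69, 0.50, 0.38, 0.29, 0.22, 0.17]

print(loss_has_plateaued(steep_then_flat))   # candidate for more capacity
print(loss_has_plateaued(still_improving))   # keep training before resizing
```

A plateau on the training loss only hints at limited capacity; it is worth comparing against the validation loss before making the model wider or deeper.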

For the dimension of the dense layer: if your task is many-to-many, meaning your labels have a fixed dimension, then the final dense layer must have exactly that many units (1369 in your case).
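Since the output width is fixed by the number of labels, the hidden Dense width is the part left to choose. As a rough illustration of what that choice costs, the sketch below counts trainable parameters for the architecture in the question under a few hidden widths. The helper names and the candidate widths 256 and 512 are assumptions for illustration; the formulas are the usual ones for a fully connected layer and an LSTM layer with biases, and the Embedding layer is left out because `max_features` is not given.

```python
def dense_params(n_in, n_units):
    # Weight matrix plus one bias per unit.
    return n_in * n_units + n_units


def lstm_params(n_in, n_units):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * ((n_in + n_units) * n_units + n_units)


n_labels = 1369                  # Y_train.shape[1], fixes the output layer
embed_size, lstm_units = 128, 60

totals = {}
for hidden in (256, 512, 2000):  # 2000 is the width used in the question
    totals[hidden] = (
        lstm_params(embed_size, lstm_units)
        + dense_params(lstm_units, hidden)   # pooled LSTM output -> hidden
        + dense_params(hidden, n_labels)     # hidden -> one unit per label
    )
    print(hidden, totals[hidden])
```

Most of the parameters sit in the last layer (hidden width times 1369), so shrinking the hidden Dense layer is the cheapest way to follow the "as few parameters as possible" advice.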
