
Inconsistency in Keras Sequential model vs Functional API

I am trying to rewrite a Sequential model using the functional API. However, when I do so, the model built with the functional API gets stuck at a very low accuracy during fit(), and its accuracy does not improve between epochs.

After doing some reading on making models reproducible across runs, I have set the seed values as follows, but still no luck:

import numpy as np
np.random.seed(2017)

# Note: set_random_seed is the TensorFlow 1.x API;
# in TensorFlow 2.x the equivalent is tf.random.set_seed(2017).
from tensorflow import set_random_seed
set_random_seed(2017)

import random as rn
rn.seed(2017)
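As a quick sanity check of the Python/NumPy side of this seeding (the TensorFlow graph seed still has to be set separately, as above), re-seeding with the same value reproduces exactly the same draws:

```python
import random

import numpy as np

def first_draws(seed):
    # Re-seed both generators, then take one draw from each.
    random.seed(seed)
    np.random.seed(seed)
    return random.random(), float(np.random.rand())

# Same seed -> identical values from both generators.
assert first_draws(2017) == first_draws(2017)
```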

The model defined using the functional API is as follows (vocab_size and embedding_matrix are defined earlier in the script):

from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense, Activation

length = 257
inputs = Input(shape=(length,))
embedding = Embedding(vocab_size, 100, weights=[embedding_matrix],
                      input_length=length, trainable=False)(inputs)
bilstm = Bidirectional(LSTM(10, return_sequences=True))(embedding)
dense1 = Dense(10, activation='sigmoid')(bilstm)
bilstm2 = Bidirectional(LSTM(10))(dense1)
dense2 = Dense(1)(bilstm2)
output = Activation('relu')(dense2)
new_model = Model(inputs=inputs, outputs=output)

The model created using the Sequential API is as follows:

from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense, Activation

model = Sequential()
e = Embedding(vocab_size, 100, weights=[embedding_matrix],
              input_length=257, trainable=False)
model.add(e)
model.add(Bidirectional(LSTM(10, return_sequences=True)))
model.add(Dense(10, activation='sigmoid'))
model.add(Bidirectional(LSTM(10)))
model.add(Dense(1))
model.add(Activation('relu'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

In both cases I have set shuffle=False in the fit() call.

model.summary() returns identical output for both models, except for the input layer. During fit(), the Sequential model's accuracy improves with each epoch, but the functional-API model stays at a low value and never improves.

I understand that the two models would not return exactly the same accuracy, but I am wondering why one model's accuracy improves over the epochs while the other's does not. Note that x_train and y_train are identical in both cases.

What am I missing here?

The activation 'relu' should not be used as the model's output. (With this seed the Sequential model will probably get stuck as well; without a seed, you may get lucky or unlucky with either model.)

This activation has a zero zone with zero gradients. If your model reaches this zone (which is especially probable when you have a single output neuron), backpropagation stops completely.
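A minimal NumPy sketch (independent of Keras) of why this happens: the derivative of ReLU is zero for every negative pre-activation, so once the single output neuron's pre-activation goes negative for all samples, no gradient flows back through it:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative of ReLU: 1 where x > 0, 0 elsewhere.
    return (x > 0).astype(float)

pre_activations = np.array([-2.0, -0.5, 0.3, 1.7])
print(relu(pre_activations))       # negative inputs are clipped to 0
print(relu_grad(pre_activations))  # zero gradient in the clipped region
```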

  • If your model is intended to output values between 0 and 1, use Activation('sigmoid').
  • If it is intended to output from 0 to infinity, try Activation('softplus').
  • If it is intended to output from 0 to C, use Lambda(lambda x: C*K.sigmoid(x)).
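The last option can be illustrated with plain NumPy (C = 5 is a hypothetical bound chosen only for this sketch): scaling a sigmoid by C keeps every output strictly inside (0, C), with a nonzero gradient everywhere, unlike ReLU's flat zero zone:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

C = 5.0  # hypothetical upper bound, for illustration only
x = np.linspace(-10.0, 10.0, 101)
y = C * sigmoid(x)

# All outputs fall strictly inside (0, C); at x = 0 the output is C/2.
assert y.min() > 0.0 and y.max() < C
```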
