val_accuracy does not increase

Currently I'm trying to train a Keras Sequential network on the pooled output from BERT. The fine-tuned BertForSequenceClassification yields good results, but using the pooled_output in a neural network does not work as intended. As input data I have 10,000 samples, each consisting of the 768 floats that my BERT model provides. I'm doing a simple binary classification, so the labels are 1s and 0s.
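
For reference, this is roughly how the pooled output can be obtained, sketched here with the HuggingFace transformers library (the model name bert-base-uncased and the max_length are placeholder assumptions, not the exact values from my setup):

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')  # assumed model name
bert = BertModel.from_pretrained('bert-base-uncased')
bert.eval()

def pooled_output(texts):
    # Tokenize a batch of strings and return the 768-dim pooled output
    enc = tokenizer(texts, padding=True, truncation=True,
                    max_length=128, return_tensors='pt')
    with torch.no_grad():
        out = bert(**enc)
    return out.pooler_output.numpy()  # shape: (batch_size, 768)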

[Image: distribution of examples per class]

As you can see, my data has a good number of examples for both classes. After shuffling them, I do a normal train/test split and create/fit my model with:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

model = Sequential()
model.add(Dense(1536, input_shape=(768,), activation='relu'))
model.add(Dense(1536, activation='relu'))
model.add(Dense(1536, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

opt = Adam(learning_rate=0.0001)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])

# Normally run with early stopping, hence the large number of epochs
history = model.fit(train_features, train_labels, epochs=800, batch_size=68, verbose=1,
                    validation_split=0.2, callbacks=[])
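
The early stopping mentioned in the comment looks roughly like this (a sketch; the patience value is an assumption):

from tensorflow.keras.callbacks import EarlyStopping

# Stop once val_loss has not improved for 20 epochs and keep the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True)

history = model.fit(train_features, train_labels, epochs=800, batch_size=68, verbose=1,
                    validation_split=0.2, callbacks=[early_stop])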

During training the loss decreases and my accuracy increases as expected. BUT the val_loss increases and the val_accuracy stays flat. Sure, I'm overfitting, but I would expect the val_accuracy to increase at least for a few epochs and only then decrease once the overfitting sets in.

[Images: training/validation loss and accuracy over epochs]

Does anyone have an idea what I'm doing wrong? Perhaps 10,000 samples aren't enough to generalize?

The model is overfitting as expected, but I am surprised it starts overfitting in the early epochs, which makes me wonder if you have some mislabeling in your validation set. At any rate, try changing the model as follows:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(1536, input_shape=(768,), activation='relu'))
model.add(Dropout(.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(.3))
model.add(Dense(128, activation='relu'))
model.add(Dropout(.3))
model.add(Dense(1, activation='sigmoid'))

See if this reduces the overfitting problem.
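
(Dropout randomly disables a fraction of the previous layer's units, here 30%, at each training step, which discourages the network from memorizing individual training examples; the narrowing layer widths also cut the parameter count considerably.)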

It was not just mislabeling in my validation set, but in my whole dataset.

I take a sample of 100,000 entries:

train_df = train_df.sample(frac=1).reset_index(drop=True)
train_df = train_df.iloc[0:100000]

and delete some rows:

train_df = train_df[train_df['label'] != '-']

After that I set a few values using train_df.at in a loop, but some indices no longer exist because I had deleted those rows. train_df.at only throws warnings, so I did not notice this. I also mixed up .loc and .iloc: in my case I selected .iloc[2:3], but the index label 2 no longer exists, so it returns the row with index label 3, which sits at position 2. I then make my changes, and train_df.at fails to write at index 2, but my loop goes on. In the next iteration .iloc returns the row with index label 4 at position 3, and my loop puts the data at index 3. From then on, all my labels are one position off.
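
A minimal sketch of the trap with hypothetical data (not my actual dataframe):

import pandas as pd

df = pd.DataFrame({'label': ['a', 'b', '-', 'c']})  # index labels 0, 1, 2, 3
df = df[df['label'] != '-']                         # index labels 0, 1, 3 (gap at 2)

print(df.iloc[2:3])    # position 2 -> the row with index LABEL 3

# .loc/.at address index labels while .iloc addresses positions, so after the
# filter they no longer agree. Resetting the index removes the gap:
df = df.reset_index(drop=True)                      # index labels 0, 1, 2 again
print(df.iloc[2:3])    # position 2 and label 2 now refer to the same row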
