
Why are Keras accuracy and loss not changing between epochs, and how do I fix it?

I'm trying to train a model like the following:

input1 = np.array([[2], [1], [4], [3], [5]])
input2 = np.array([[2, 1, 8, 4], [2, 6, 1, 9], [7, 3, 1, 4], [3, 1, 6, 10], [3, 2, 7, 5]])
outputs = np.array([[3,3,1,0], [3,3,3,0], [3,3,4,0], [3,3,1,0], [3,3,4,0]])

merged = np.column_stack([input1, input2])
model = keras.Sequential([
    keras.layers.Dense(2, input_dim=5, activation='relu'),
    keras.layers.Dense(2, activation='relu'),
    keras.layers.Dense(4, activation='sigmoid'),
])

model.compile(
    loss="mean_squared_error", optimizer="adam", metrics=["accuracy"]
)

model.fit(merged, outputs, batch_size=16, epochs = 100)

This results in an accuracy of 0.6000 and a loss of about 4.6, and neither changes between epochs.

Why is this, and how can I get it to work?

I've tried changing the optimizer and loss function to a few different ones.

Your model is too simple to fit the non-linear data. A larger model like this might work:

model = keras.Sequential([
    keras.layers.Dense(20, input_dim=5, activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(10, activation='relu'),
    keras.layers.Dense(4, activation='relu'),
])

In the final Dense layer you chose a sigmoid activation, which can only output values between 0 and 1, but your target values lie outside that range. This is another reason the loss cannot go down, so changing the activation to relu (as in the model above) fixes it.
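As a rough sketch of how the revised model could be trained (this reuses the merged and outputs arrays from the question; dropping the accuracy metric and using batch_size=1 are assumptions of this sketch, not part of the original answer):

# Regression-style setup: mean squared error on the raw integer targets.
# The "accuracy" metric is not meaningful for regression, so it is omitted.
model.compile(loss="mean_squared_error", optimizer="adam")
model.fit(merged, outputs, batch_size=1, epochs=100)

# The relu output layer is unbounded above, so predictions can reach values like 3 or 4.
print(model.predict(merged))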

OK, I have found the reason for my issue, thanks to the other answer, the comments above, and some further reading. I found out I need to one-hot encode the data into binary vectors, and also reduce the batch_size to 1. This is my code now; it does a much better job and the loss actually decreases.

import keras
from sklearn.preprocessing import OneHotEncoder
import numpy as np

input1 = np.array([[2], [1], [4], [3], [5]])
input2 = np.array([[2, 1, 8, 4], [2, 6, 1, 9], [7, 3, 1, 4], [3, 1, 6, 10], [3, 2, 7, 5]])
outputs = np.array([[3,3,1,0], [3,3,3,0], [3,3,4,0], [3,3,1,0], [3,3,4,0]])
merged = np.column_stack([input1, input2])
# Use separate encoders for inputs and targets so each keeps its own categories
# (re-fitting a single encoder would overwrite what it learned for the inputs).
ohe_x = OneHotEncoder()
ohe_y = OneHotEncoder()
x = ohe_x.fit_transform(merged).toarray()   # 5 integer columns -> 20 binary features
y = ohe_y.fit_transform(outputs).toarray()  # 4 integer columns -> 6 binary targets


model = keras.Sequential([
    keras.layers.Dense(30, input_dim=20, activation='relu'),
    keras.layers.Dense(20, activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(10, activation='relu'),
    # sigmoid output: each of the 6 one-hot positions is predicted independently in [0, 1]
    keras.layers.Dense(6, activation='sigmoid')
])

# binary_crossentropy matches the per-position sigmoid outputs
model.compile(loss="binary_crossentropy", optimizer='adam')
model.fit(x, y, batch_size=1, epochs=100)
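Purely as an illustration (assuming the separate ohe_y encoder fitted on outputs above), the sigmoid outputs can be mapped back to the original integer labels with inverse_transform, which keeps the highest-scoring category for each original column:

preds = model.predict(x)                  # shape (5, 6), values between 0 and 1
decoded = ohe_y.inverse_transform(preds)  # back to the original 4-column integer labels
print(decoded)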

This works and answers the question, but it doesn't appear to actually solve the problem for my real use case. That's another topic though, so I've asked a separate question.
