Keras Accuracy and Loss not changing over a large period of epochs

I am trying to create a Convolutional Neural Network to classify which language a given "word" comes from. There are two files ("english_words.txt" and "spanish_words.txt"), each containing about 60,000 words. I have converted each word into a 29-dimensional vector in which each element is a number between 0 and 1. I am training the model for 500 epochs with the "adam" optimizer. However, the loss hovers around 0.7 and the accuracy around 0.5, and no matter how long I train, these metrics do not improve. Here is the code:

import keras
import numpy as np
from keras.layers import Dense
from keras.models import Sequential
import re

train_labels = []
train_data = []

with open("english_words.txt") as words:
    full_words = words.read()
    full_words = full_words.split("\n")

    # English words get the one-hot label [1, 0].

    # Encode each word as a 29-dimensional vector, zero-padded on the right.
    vector = []
    i = 0
    for word in full_words:
        train_labels.append([1,0])
        for letter in word:
            vector.append((ord(letter) - 96) * (1.0 / 26.0))
            i += 1
        if (i < 29):
            for x in range(0, 29 - i):
                vector.append(0)
        train_data.append(vector)
        vector = []
        i = 0
with open("spanish_words.txt") as words:
    full_words = words.read()
    full_words = full_words.replace(' ', '')

    full_words = full_words.replace('\n', ',')
    full_words = full_words.split(",")
    vector = []
    for word in full_words:
        train_labels.append([0,1])
        for letter in word:
            vector.append((ord(letter) - 96) * (1.0 / 26.0))
            i += 1
        if (i < 29):
            for x in range(0, 29 - i):
                vector.append(0)
        train_data.append(vector)
        vector = []
        i = 0


def shuffle_in_unison(a, b):
    assert len(a) == len(b)
    shuffled_a = np.empty(a.shape, dtype=a.dtype)
    shuffled_b = np.empty(b.shape, dtype=b.dtype)
    permutation = np.random.permutation(len(a))
    for old_index, new_index in enumerate(permutation):
        shuffled_a[new_index] = a[old_index]
        shuffled_b[new_index] = b[old_index]
    return shuffled_a, shuffled_b



train_data = np.asarray(train_data, dtype=np.float32)
train_labels = np.asarray(train_labels, dtype=np.float32)



train_data, train_labels = shuffle_in_unison(train_data, train_labels)

print(train_data.shape, train_labels.shape)
model = Sequential()
model.add(Dense(29, input_shape=(29,)))
model.add(Dense(60))
model.add(Dense(40))
model.add(Dense(25))
model.add(Dense(2))

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

model.fit(train_data, train_labels, epochs=500, batch_size=128)

model.save("language_predictor.model")

For some extra info, I am running Python 3.x with TensorFlow 1.15 and Keras 1.15 on Windows x64.

I can see several potential problems with your code.

  1. You stacked several Dense layers one after another, but none of them has a non-linear activation function, which you set with the activation=... parameter (for example activation="relu" on the hidden layers). Without any non-linear activations, all those fully-connected Dense layers mathematically collapse into a single linear Dense layer that cannot learn a non-linear decision boundary. The output layer has no activation either, while categorical_crossentropy expects probabilities by default, so the last layer should use activation="softmax".

  2. In general, if your loss and accuracy stop improving or even get worse, the first thing to try is reducing the learning rate.

  3. You don't necessarily need to implement your own shuffling function. The Keras fit() function shuffles the training data before each epoch when shuffle=True, which is the default.
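The collapse described in point 1 can be checked numerically. The following is a minimal NumPy sketch, not the asker's model: the weights are random, the layer sizes (29 inputs, 60 hidden units, 2 outputs) are borrowed from the question for illustration only, and all variable names are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for two stacked Dense layers with no activation:
# 29 inputs -> 60 units -> 2 outputs (sizes borrowed from the question).
W1, b1 = rng.normal(size=(29, 60)), rng.normal(size=60)
W2, b2 = rng.normal(size=(60, 2)), rng.normal(size=2)

x = rng.uniform(0.0, 1.0, size=(5, 29))  # five made-up input vectors

# Two linear layers applied one after another.
stacked = (x @ W1 + b1) @ W2 + b2

# The same mapping written as one linear layer with merged weights.
W_single = W1 @ W2
b_single = b1 @ W2 + b2
collapsed = x @ W_single + b_single

linear_layers_collapse = np.allclose(stacked, collapsed)

# With a ReLU between the layers, the single-layer rewrite no longer matches.
with_relu = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2
relu_breaks_collapse = not np.allclose(with_relu, collapsed)

print(linear_layers_collapse, relu_breaks_collapse)
```

The same argument extends to any number of stacked linear layers, which is why adding more Dense layers without activations does not add modelling capacity.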

In addition to the points mentioned by stackoverflowuser2010:

  1. I find this a very good read and highly suggest checking the points it lists: 37 Reasons why your Neural Network is not working

  2. Center your input data: compute the component-wise mean vector over the training set and subtract it from every input.
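Centering as described in point 2 might look like the sketch below. The helper name center_features and the tiny example array are hypothetical; in the question's script the array to pass in would be train_data after the np.asarray conversion.

```python
import numpy as np


def center_features(data):
    """Subtract the per-column (component-wise) mean from every row.

    Returns the centered array and the mean vector, so the same mean
    can be subtracted from new inputs at prediction time.
    """
    mean_vector = data.mean(axis=0)
    return data - mean_vector, mean_vector


# Made-up stand-in for train_data: three samples, three features.
example = np.array([[0.1, 0.5, 0.0],
                    [0.3, 0.7, 0.0],
                    [0.5, 0.9, 0.0]], dtype=np.float32)

centered, mean_vector = center_features(example)
```

Keeping mean_vector around matters because inputs seen at prediction time should be shifted by the training mean, not by their own mean.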
