Accuracy doesn't change over keras training, loss barely decreases

I'm trying to train a neural network with Keras to solve picross (a.k.a. nonogram) puzzles on a 5*5 grid. This means the network would ideally have multiple correct activations for each training case.

I've written code to randomly generate the training data and initialize the neural network, but when I run it, the network's accuracy never changes and the loss decreases only slightly:

Epoch 1/100
100000/100000 [==============================] - 13s 133us/sample - loss: 1.6282 - acc: 0.5001
Epoch 2/100
100000/100000 [==============================] - 13s 131us/sample - loss: 1.6233 - acc: 0.5001
Epoch 3/100
100000/100000 [==============================] - 13s 132us/sample - loss: 1.6175 - acc: 0.5001
...
Epoch 99/100
100000/100000 [==============================] - 14s 136us/sample - loss: 1.4704 - acc: 0.5001
Epoch 100/100
100000/100000 [==============================] - 14s 136us/sample - loss: 1.4696 - acc: 0.5001

I'm running this in a Jupyter notebook.

I've been told that "binary_crossentropy" is an ideal loss function for this problem, but I have no idea how to format the training data labels for it. Should they be a list of ones and zeros, a list of numbers, or an array...?

The output layer is 25 neurons, each corresponding to a block on the 5*5 grid. Each should have a correct activation of 1 or 0 depending on whether that block is empty or not.
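For reference, with binary_crossentropy and 25 independent outputs, the labels can simply be a 2-D numpy array of 0s and 1s with one row per training case. A minimal sketch (the grids here are made up for illustration, not taken from the post):

import numpy as np

# Hypothetical example: targets for 3 puzzles. Each row is one training case
# with 25 independent 0/1 entries, one per cell of the flattened 5*5 grid.
y_example = np.array([
    [1, 0, 1, 1, 0] * 5,   # puzzle 1, flattened
    [0, 1, 0, 0, 1] * 5,   # puzzle 2
    [1, 1, 1, 0, 0] * 5,   # puzzle 3
])
print(y_example.shape)   # (3, 25)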

import numpy as np
import tensorflow as tf

network = tf.keras.models.Sequential()
network.add(tf.keras.layers.Flatten())
network.add(tf.keras.layers.Dense(750, activation=tf.nn.relu))
network.add(tf.keras.layers.Dense(500, activation=tf.nn.relu))
network.add(tf.keras.layers.Dense(100, activation=tf.nn.relu))
network.add(tf.keras.layers.Dense(25, activation=tf.nn.softmax))  # one output per grid cell
network.compile(optimizer='SGD',
                loss='binary_crossentropy',
                metrics=['accuracy'])
network.fit(scaled_x_train, y_train, epochs=100, batch_size=50)

I expected the accuracy to increase as training went on, even if only by a little, but it stays stuck at whatever value it starts with, and the loss only decreases slightly.

Edit: The inputs to the neural network are the "hints", scaled down to values between 0 and 1. Here is the code that creates the data:

import random
import numpy as np
from sklearn.preprocessing import MinMaxScaler

x_train = []
y_train = []

for m in range(100000):  #creating a data set with 100000 items in it
    grid = [[0,0,0,0,0],[0,0,0,0,0],[0,0,0,0,0],[0,0,0,0,0],[0,0,0,0,0]]
    hints = [[[],[],[],[],[]],[[],[],[],[],[]]]

    for i in range(5):
        for j in range(5):
            grid[i][j] = random.randint(0,1)   #All items in the grid are random, either 0s or 1s


    sub_y_train = []
    for z in range(5):
        for x in range(5):
            sub_y_train.append(grid[z][x])

    sub_y_train = np.array(sub_y_train)
    y_train.append(sub_y_train)         #the grids are added to the data set first



    ##figuring out the hints along the vertical axis
    for i in range(5):
        counter = 0
        for j in range(4):
            if grid[i][j] == 1:
                counter += 1
                if grid[i][j+1] == 0:
                    hints[0][i].append(counter)
                    counter = 0
        if grid[i][4] == 1:
            hints[0][i].append(counter+1)
            counter = 0


    ##figuring out the hints along the horizontal axis
    for i in range(5):
        counter = 0
        for j in range(4):
            if grid[j][i] == 1:
                counter += 1
                if grid[j+1][i] == 0:
                    hints[1][i].append(counter)
                    counter = 0
        if grid[4][i] == 1:
            hints[1][i].append(counter+1)
            counter = 0

    for i in range(2):
        for j in range(5):
            while len(hints[i][j]) != 3:
                hints[i][j].append(0)

    new_hints = []
    for i in range(2):
        for j in range(5):
            for k in range(3):
                new_hints.append(hints[i][j][k])

    new_hints.append(5)

    x_train.append(new_hints)    #Once the hints are padded and flattened, they are added to x_train


x_train = np.array(x_train)      #Both x_train and y_train are converted into numpy arrays
y_train = np.array(y_train)



scaler = MinMaxScaler(feature_range=(0,1))
scaled_x_train = scaler.fit_transform(x_train)

for i in range(5):
    print(scaled_x_train[i])
    print(y_train[i])
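
As a sanity check on the run-length logic above: the hints for one line of the grid are just the lengths of its runs of 1s, zero-padded to length 3. A compact equivalent using itertools.groupby (a hypothetical helper, not part of the original code):

from itertools import groupby

def line_hints(line, pad_to=3):
    # run lengths of consecutive 1s, zero-padded to a fixed length
    runs = [len(list(group)) for value, group in groupby(line) if value == 1]
    return runs + [0] * (pad_to - len(runs))

print(line_hints([1, 1, 0, 1, 0]))   # [2, 1, 0]
print(line_hints([0, 0, 0, 0, 0]))   # [0, 0, 0]
print(line_hints([1, 1, 1, 1, 1]))   # [5, 0, 0]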

Peteris was correct: replacing the "softmax" activation function with "sigmoid" on the output layer of the network has helped the accuracy steadily increase. Currently, the network almost reaches a steady 95% accuracy. (Thank you so much, I've been trying to get this working for weeks.)
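For anyone hitting the same wall: softmax forces the 25 outputs to sum to 1, so the network cannot mark several cells as filled at once, while sigmoid scores each cell independently, which is what binary_crossentropy expects for a multi-label target. A minimal sketch of the fix (same architecture as above, only the output activation changes):

network = tf.keras.models.Sequential()
network.add(tf.keras.layers.Flatten())
network.add(tf.keras.layers.Dense(750, activation=tf.nn.relu))
network.add(tf.keras.layers.Dense(500, activation=tf.nn.relu))
network.add(tf.keras.layers.Dense(100, activation=tf.nn.relu))
# sigmoid gives each of the 25 cells an independent probability of being
# filled, instead of 25 values forced to sum to 1
network.add(tf.keras.layers.Dense(25, activation=tf.nn.sigmoid))
network.compile(optimizer='SGD',
                loss='binary_crossentropy',
                metrics=['accuracy'])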
