
Keras model doesn't learn at all

My model weights (I write them out to weights_before.txt and weights_after.txt) are exactly the same before and after training, i.e. training changes nothing; no fitting is happening.

My data looks like this (I basically want the model to predict the sign of feature: result is 0 if feature is negative, 1 if it is positive):

,feature,zerosColumn,result
0,-5,0,0
1,5,0,1
2,-3,0,0
3,5,0,1
4,3,0,1
5,3,0,1
6,-3,0,0
...
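
For reference, a dataset of this shape could be synthesized with a few lines of pandas/NumPy. This is a minimal illustrative sketch (the values and row count are my assumption; only the column layout matches the sample above):

import os
import numpy as np
import pandas as pd

# Illustrative only: `result` is 1 when `feature` is positive, 0 when negative.
rng = np.random.default_rng(570)
feature = rng.choice([-5, -3, 3, 5], size=1000)
df = pd.DataFrame({
    'feature': feature,
    'zerosColumn': 0,
    'result': (feature > 0).astype(int),
})
os.makedirs('data', exist_ok=True)
df.to_csv('data/sign_data.csv')  # the index becomes the first, unnamed column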

A brief summary of my approach:

  1. Load the data.
  2. Split it column-wise into x (feature) and y (result), then split these two row-wise into test and validation sets.
  3. Transform these sets into TimeseriesGenerators (not necessary in this scenario, but I want to get this setup working, and I don't see any reason why it shouldn't).
  4. Create and compile a simple Sequential model with a few Dense layers and a softmax activation on its output layer, using binary_crossentropy as the loss function.
  5. Train the model... nothing happens!

The complete code follows:

import keras
import pandas as pd
import numpy as np

np.random.seed(570)

TIMESERIES_LENGTH = 1
TIMESERIES_SAMPLING_RATE = 1
TIMESERIES_BATCH_SIZE = 1024
TEST_SET_RATIO = 0.2  # the portion of total data to be used as test set
VALIDATION_SET_RATIO = 0.2  # the portion of total data to be used as validation set
RESULT_COLUMN_NAME = 'result'    # column to predict
FEATURE_COLUMN_NAME = 'feature'  # input column

def create_network(csv_path, save_model):
    before_file = open("weights_before.txt", "w")
    after_file = open("weights_after.txt", "w")

    data = pd.read_csv(csv_path)

    # shift targets so TimeseriesGenerator pairs each window with the label
    # of the row that follows it
    data[RESULT_COLUMN_NAME] = data[RESULT_COLUMN_NAME].shift(1)
    data = data.dropna()

    x = data.iloc[:, 1:2]  # the 'feature' column (.ix is removed in modern pandas)
    y = data.iloc[:, 3]    # the 'result' column

    test_set_length = int(round(len(x) * TEST_SET_RATIO))
    validation_set_length = int(round(len(x) * VALIDATION_SET_RATIO))

    x_train_and_val = x[:-test_set_length]
    y_train_and_val = y[:-test_set_length]
    x_train = x_train_and_val[:-validation_set_length].values
    y_train = y_train_and_val[:-validation_set_length].values
    x_val = x_train_and_val[-validation_set_length:].values
    y_val = y_train_and_val[-validation_set_length:].values


    train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_train,
        y_train,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )

    val_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x_val,
        y_val,
        length=TIMESERIES_LENGTH,
        sampling_rate=TIMESERIES_SAMPLING_RATE,
        batch_size=TIMESERIES_BATCH_SIZE
    )
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(10, activation='relu', input_shape=(TIMESERIES_LENGTH, 1)))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Dense(10, activation='relu'))
    model.add(keras.layers.Dropout(0.2))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(1, activation='softmax'))

    for item in model.get_weights():
        before_file.write("%s\n" % item)

    model.compile(
        loss=keras.losses.binary_crossentropy,
        optimizer="adam",
        metrics=[keras.metrics.binary_accuracy]
    )

    history = model.fit_generator(
        train_gen,
        epochs=10,
        verbose=1,
        validation_data=val_gen
    )

    for item in model.get_weights():
        after_file.write("%s\n" % item)

    before_file.close()
    after_file.close()

create_network("data/sign_data.csv", False)

Do you have any ideas?

The problem is that you are using softmax as the activation function of the last layer. Essentially, softmax normalizes its input so that the elements sum to one. Therefore, if you use it on a layer with only one unit (i.e. Dense(1, ...)), it will always output 1. To fix this, change the activation function of the last layer to sigmoid, which outputs a value in the range (0, 1).
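
To make the failure mode concrete, here is a minimal NumPy sketch (standalone softmax/sigmoid functions written out for illustration, not Keras internals) showing why a single-unit softmax is constant and why sigmoid is the right replacement:

import numpy as np

# softmax(z)_i = exp(z_i) / sum_j exp(z_j); with a single element the
# numerator and denominator are identical, so the output is always 1.
def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

print(softmax(np.array([-3.7])))  # [1.]
print(softmax(np.array([42.0])))  # [1.] -- the logit's value is irrelevant

# sigmoid maps a single logit to the range (0, 1) instead:
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(-3.7))   # ~0.024
print(sigmoid(42.0))   # ~1.0

The corrected output layer would therefore be:

model.add(keras.layers.Dense(1, activation='sigmoid'))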
