
Extremely basic neural network not learning

I've gone through some of the CNTK Python tutorials and I'm trying to write an extremely basic one-layer neural network that can compute a logical AND. I have functioning code, but the network isn't learning - in fact, the loss gets worse with each minibatch trained.

import numpy as np
from cntk import Trainer
from cntk.learner import sgd
from cntk import ops
from cntk.utils import get_train_eval_criterion, get_train_loss

input_dimensions = 2
# Define the training set
input_data = np.array([
    [0, 0], 
    [0, 1],
    [1, 0],
    [1, 1]], dtype=np.float32)

# Each index matches with an index in input data
correct_answers = np.array([[0], [0], [0], [1]])

# Create the input layer
net_input = ops.input_variable(2, np.float32)
weights = ops.parameter(shape=(2, 1))
bias = ops.parameter(shape=(1))

network_output = ops.times(net_input, weights) + bias

# Set up training
expected_output = ops.input_variable((1), np.float32)
loss_function = ops.cross_entropy_with_softmax(network_output, expected_output)
eval_error = ops.classification_error(network_output, expected_output)

learner = sgd(network_output.parameters, lr=0.02)
trainer = Trainer(network_output, loss_function, eval_error, [learner])

minibatch_size = 4
num_samples_to_train = 1000
num_minibatches_to_train = int(num_samples_to_train/minibatch_size)
training_progress_output_freq = 20

def print_training_progress(trainer, mb, frequency, verbose=1):
    training_loss, eval_error = "NA", "NA"

    if mb % frequency == 0:
        training_loss = get_train_loss(trainer)
        eval_error = get_train_eval_criterion(trainer)
        if verbose:
            print("Minibatch: {0}, Loss: {1:.4f}, Error: {2:.2f}".format(
                mb, training_loss, eval_error))

    return mb, training_loss, eval_error


for i in range(0, num_minibatches_to_train):
    trainer.train_minibatch({net_input: input_data, expected_output: correct_answers})
    batchsize, loss, error = print_training_progress(trainer, i, training_progress_output_freq, verbose=1)

Sample training output

Minibatch: 0, Loss: -164.9998, Error: 0.75
Minibatch: 20, Loss: -166.0998, Error: 0.75
Minibatch: 40, Loss: -167.1997, Error: 0.75
Minibatch: 60, Loss: -168.2997, Error: 0.75
Minibatch: 80, Loss: -169.3997, Error: 0.75
Minibatch: 100, Loss: -170.4996, Error: 0.75
Minibatch: 120, Loss: -171.5996, Error: 0.75
Minibatch: 140, Loss: -172.6996, Error: 0.75
Minibatch: 160, Loss: -173.7995, Error: 0.75
Minibatch: 180, Loss: -174.8995, Error: 0.75
Minibatch: 200, Loss: -175.9995, Error: 0.75
Minibatch: 220, Loss: -177.0994, Error: 0.75
Minibatch: 240, Loss: -178.1993, Error: 0.75

I'm not really sure what's going on here. Error is stuck at 0.75, which I think means the network is performing no better than chance. I'm uncertain whether I've misunderstood a requirement of ANN architecture or whether I'm misusing the library.

Any help would be appreciated.

You are trying to solve a binary classification problem with a softmax as your final layer. The softmax layer is not the right layer here; it is only effective for multiclass (three or more classes) problems, and applied to your single output node it always produces 1, so the loss gives the learner nothing useful to work with.
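
You can check that degeneracy directly: softmax normalizes its inputs to sum to 1, so over a single node it returns 1.0 no matter what the logit is. A quick numpy illustration (not part of the original answer, just a sanity check):

import numpy as np

def softmax(z):
    # Shift by the max for numerical stability before exponentiating
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([-5.0])))  # [ 1.]
print(softmax(np.array([42.0])))  # [ 1.]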

For binary classification problems you should make the following two modifications (a minimal sketch applying both appears after the list):

  • Add a sigmoid layer to your output (this will make your output look like a probability)
  • Use binary_cross_entropy as your criterion (you will have to be on at least this release)
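
Putting the two changes together, a minimal sketch might look like the following. It assumes a CNTK release where binary_cross_entropy is exposed under cntk.ops (other versions ship it as cntk.losses.binary_cross_entropy), casts the labels to float32 to match the rest of the network, and reuses the loss as the reported metric, since classification_error is not meaningful for a single output node:

import numpy as np
from cntk import Trainer
from cntk.learner import sgd
from cntk import ops

input_data = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]], dtype=np.float32)
# Labels cast to float32 to match the network's input type
correct_answers = np.array([[0], [0], [0], [1]], dtype=np.float32)

net_input = ops.input_variable(2, np.float32)
weights = ops.parameter(shape=(2, 1))
bias = ops.parameter(shape=(1))

# Modification 1: a sigmoid squashes the single output into (0, 1)
network_output = ops.sigmoid(ops.times(net_input, weights) + bias)

expected_output = ops.input_variable(1, np.float32)
# Modification 2: binary cross-entropy as the training criterion
# (assumed available as ops.binary_cross_entropy on this release)
loss_function = ops.binary_cross_entropy(network_output, expected_output)

learner = sgd(network_output.parameters, lr=0.02)
# The loss doubles as the progress metric here
trainer = Trainer(network_output, loss_function, loss_function, [learner])

for i in range(250):
    trainer.train_minibatch({net_input: input_data,
                             expected_output: correct_answers})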
