
Simple Lasagne Neural Network not working

I'm using the Lasagne package to build a simple 3-layer neural network, and I'm testing it on a very simple dataset (just 4 examples).

X = np.array([[0,0,1],
              [0,1,1],
              [1,0,1],
              [1,1,1]])         

y = np.array([[0, 0],[1, 0],[1, 1],[0, 1]])

However, it fails to learn this and produces the following predictions:

pred = theano.function([input_var], [prediction])
np.round(pred(X), 2)
array([[[ 0.5 ,  0.5 ],
        [ 0.98,  0.02],
        [ 0.25,  0.75],
        [ 0.25,  0.75]]])

Full code:

import numpy as np
import theano
import theano.tensor as T
import lasagne

def build_mlp(input_var=None):
    l_in = lasagne.layers.InputLayer(shape=(None, 3), input_var=input_var)
    l_hid1 = lasagne.layers.DenseLayer(
        l_in, num_units=4,
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())
    l_hid2 = lasagne.layers.DenseLayer(
        l_hid1, num_units=4,
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())
    l_out = lasagne.layers.DenseLayer(
        l_hid2, num_units=2,
        nonlinearity=lasagne.nonlinearities.softmax)
    return l_out

input_var = T.lmatrix('inputs')
target_var = T.lmatrix('targets')

network = build_mlp(input_var)

prediction = lasagne.layers.get_output(network, deterministic=True)
loss = lasagne.objectives.squared_error(prediction, target_var)
loss = loss.mean()

params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(
    loss, params, learning_rate=0.01, momentum=0.9)

train_fn = theano.function([input_var, target_var], loss, updates=updates)
val_fn = theano.function([input_var, target_var], [loss])

Training:

num_epochs = 1000
for epoch in range(num_epochs):
    inputs, targets = (X, y)
    train_fn(inputs, targets)   

I'm guessing there might be an issue with the nonlinearities used in the hidden layers, or with the learning method.

This is my guess at the problem.

First, I don't know why there is an output like [0,0] in your targets. Does that mean the sample does not belong to any of the classes?
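A small NumPy sketch (not part of the original code; the `softmax` helper and the sample logits are made up for illustration) suggests why a target such as [0,0] is awkward for a softmax output layer: every output row sums to 1, so [0,0] and [1,1] can never be produced. Under squared error, the closest a two-unit softmax can get to [0,0] is [0.5, 0.5].

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax (hypothetical helper for illustration).
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Arbitrary logits: each softmax row sums to 1, whatever the inputs are.
logits = np.array([[2.0, -1.0], [0.0, 0.0], [-3.0, 5.0]])
row_sums = softmax(logits).sum(axis=1)

# Scan all two-class outputs [p, 1 - p] and find the one closest to target [0, 0].
p = np.linspace(0.0, 1.0, 101)
sq_err = p ** 2 + (1.0 - p) ** 2
best_p = p[np.argmin(sq_err)]

print(row_sums, best_p)
```

The same scan with target [1,1] gives the same minimizer by symmetry, so neither of those two targets can be fit exactly by this output layer.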

Second, you are using softmax in the last layer, which is usually used for classification. Are you building this network for classification? If you are confused about the output, note that it is the probability of each class, so I think the output is correct:

  • The second sample's prediction is [0.98 0.02], which means the second sample belongs to the first class, like your target [1 0].

  • The third sample's prediction is [0.25 0.75], which means the third sample belongs to the second class, like your target [1 1] (regardless of the first-class value; since this is classification, the system will count it as a correct classification).

  • The fourth sample's prediction is [0.25 0.75], which means the fourth sample belongs to the second class, like your target [0 1].

  • The first sample's prediction is [0.5 0.5]. This one is a bit confusing to me; my guess is that Lasagne treats a sample with the same probability for every class as not belonging to any class.
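The class-by-class reading above can be checked by taking the argmax of each predicted row and comparing it with the targets. The sketch below uses the prediction and target values quoted in the question; the variable names are illustrative. Note that `argmax` breaks ties by returning the first index, so the tied rows ([0.5 0.5], [0 0], [1 1]) have no meaningful class, and only rows with exactly one active target are compared.

```python
import numpy as np

# Predictions and targets copied from the question.
pred = np.array([[0.50, 0.50],
                 [0.98, 0.02],
                 [0.25, 0.75],
                 [0.25, 0.75]])
y = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])

pred_class = pred.argmax(axis=1)       # most probable class per sample
target_class = y.argmax(axis=1)        # only meaningful for one-hot rows
has_unique_label = y.sum(axis=1) == 1  # rows that are valid one-hot targets

# Compare predicted and target class only where the target is one-hot.
matches = pred_class[has_unique_label] == target_class[has_unique_label]
print(pred_class, has_unique_label, matches)
```

Only the second and fourth samples have one-hot targets, and both are classified correctly under this reading.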

I don't think you can really judge whether the model is learning correctly based on the above.

  1. Number of training instances: You have 4 training instances. The neural network you constructed contains 3*4 + 4*4 + 4*2 = 36 weights (not counting biases) that it has to learn. Not to mention that you have 4 different types of outputs. The network is definitely underfitting, which may explain the unexpected results.

  2. How to test whether a model is working: If I wanted to test whether a neural network is learning correctly, I would test it on a known-good dataset (like MNIST) and make sure my model learns it with high probability. You could also try comparing against another neural network implementation you've already written, or against the literature. If I really wanted to go micro, I would use boosting with a linearly separable dataset.
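As a rough illustration of the "go micro" idea in point 2, here is a minimal sanity check on a linearly separable dataset. It is only a sketch under stated assumptions: it uses plain logistic regression in NumPy rather than boosting or the Lasagne network from the question, and the two-blob dataset, learning rate, and step count are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two well-separated Gaussian blobs (linearly separable).
n = 50
X = np.vstack([rng.normal(-2.0, 0.5, size=(n, 2)),
               rng.normal(+2.0, 0.5, size=(n, 2))])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Single-layer logistic regression trained with plain gradient descent.
w = np.zeros(2)
b = 0.0
lr = 0.1
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of class 1
    grad = p - y                             # cross-entropy gradient w.r.t. the logits
    w -= lr * X.T @ grad / len(y)
    b -= lr * grad.mean()

# Fraction of training points on the correct side of the decision boundary.
accuracy = ((X @ w + b > 0) == (y == 1)).mean()
print("training accuracy:", accuracy)
```

If a training loop cannot reach near-perfect accuracy on data this easy, the problem is more likely in the setup (loss, targets, output layer) than in the amount of data.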

If your model still doesn't learn properly after that, then I would be concerned.
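For reference, the weight count in point 1 can be reproduced from the layer sizes in `build_mlp`. The sketch below also counts bias terms, on the assumption that each `DenseLayer` keeps its default bias vector (the code in the question does not disable it); the variable names are illustrative.

```python
# Layer widths taken from build_mlp: input, hidden 1, hidden 2, output.
layer_sizes = [3, 4, 4, 2]

# One weight per connection between consecutive layers.
weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# One bias per unit in every non-input layer (assumes default biases are kept).
biases = sum(layer_sizes[1:])

print(weights, biases, weights + biases)
```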


 