
Cannot train a neural network solving XOR mapping

I am trying to implement a simple classifier for the XOR problem in Keras. Here is the code:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
import numpy

X = numpy.array([[1., 1.], [0., 0.], [1., 0.], [0., 1.], [1., 1.], [0., 0.]])
y = numpy.array([[0.], [0.], [1.], [1.], [0.], [0.]])
model = Sequential()
model.add(Dense(2, input_dim=2, init='uniform', activation='sigmoid'))
model.add(Dense(3, init='uniform', activation='sigmoid'))
model.add(Dense(1, init='uniform', activation='softmax'))
sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)

model.fit(X, y, nb_epoch=20)
print()
score = model.evaluate(X, y)
print()
print(score)
print(model.predict(numpy.array([[1, 0]])))
print(model.predict(numpy.array([[0, 0]])))

I tried changing the number of epochs, the learning rate, and other parameters, but the loss remains constant from the first epoch to the last.

Epoch 13/20
6/6 [==============================] - 0s - loss: 0.6667 
Epoch 14/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 15/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 16/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 17/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 18/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 19/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 20/20
6/6 [==============================] - 0s - loss: 0.6667

6/6 [==============================] - 0s

0.666666686535
[[ 1.]]
[[ 1.]]

How do you train this network in Keras?

Also, is there a better library for implementing neural networks? I tried PyBrain, but it has been abandoned. I also tried scikit-neuralnetwork, but the documentation is really cryptic, so I couldn't figure out how to train it. And I seriously doubt whether Keras even works.

In your example, you have a Dense layer with 1 unit and a softmax activation. The value of such a unit will always be 1.0, so no information can flow from your inputs to your outputs, and the network won't learn anything. Softmax is only really useful when you need to produce a probability distribution over n classes, where n is greater than 2.
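A quick way to see this is to apply a standalone softmax to a single value; regardless of the input, the result is always 1.0 (this snippet only illustrates the math, it does not use Keras):

import numpy

def softmax(z):
    # Standard softmax: exponentiate and normalize so the outputs sum to 1.
    e = numpy.exp(z - numpy.max(z))
    return e / e.sum()

# With a single output unit, softmax normalizes a one-element vector,
# so the output is exactly 1.0 no matter what the unit computes.
print(softmax(numpy.array([-3.7])))   # [ 1.]
print(softmax(numpy.array([42.0])))   # [ 1.]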

The other answers suggest changes to the code to make it work. Just removing activation='softmax' may be enough.
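For illustration, a minimal sketch of the question's model with the output activation changed from softmax to sigmoid (the larger learning rate and epoch count here are my own assumptions, not part of the original code):

from keras.models import Sequential
from keras.layers.core import Dense
from keras.optimizers import SGD
import numpy

X = numpy.array([[1., 1.], [0., 0.], [1., 0.], [0., 1.]])
y = numpy.array([[0.], [0.], [1.], [1.]])

model = Sequential()
# Hidden layer: 2 sigmoid units are enough to separate XOR.
model.add(Dense(2, input_dim=2, init='uniform', activation='sigmoid'))
# Output layer: sigmoid instead of softmax, so the single unit can vary between 0 and 1.
model.add(Dense(1, init='uniform', activation='sigmoid'))

sgd = SGD(lr=0.1, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)

model.fit(X, y, nb_epoch=5000, batch_size=4)
print(model.predict(X))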

Keras does generally work.

Try the last perceptron in the network without an activation function. I had the same problem, and it started learning once I removed the activation function.

Also, you could try splitting the output layer into 2 neurons, with the targets encoded as [0, 1] for 0 and [1, 0] for 1, as sketched below.

However, removing the activation function should do the trick.
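For the two-neuron variant, a minimal sketch might look like the following (the hidden-layer size, learning rate, loss, and epoch count are assumptions on my part, not part of the original answer):

from keras.models import Sequential
from keras.layers.core import Dense
from keras.optimizers import SGD
import numpy

X = numpy.array([[1., 1.], [0., 0.], [1., 0.], [0., 1.]])
# One-hot targets following the encoding above: [0, 1] means class 0, [1, 0] means class 1.
y = numpy.array([[0., 1.], [0., 1.], [1., 0.], [1., 0.]])

model = Sequential()
model.add(Dense(4, input_dim=2, init='uniform', activation='sigmoid'))
# With 2 output units, softmax becomes meaningful again: it yields a distribution over the two classes.
model.add(Dense(2, init='uniform', activation='softmax'))

sgd = SGD(lr=0.1, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

model.fit(X, y, nb_epoch=5000, batch_size=4)
print(model.predict(X))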

This code works for me:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Activation, Dense
from keras.optimizers import SGD

X = np.array([[1, 1], [0, 0], [1, 0], [0, 1], [1, 1], [0, 0]], dtype='uint8')
y = np.array([[0], [0], [1], [1], [0], [0]], dtype='uint8')


model = Sequential()
model.add(Dense(2, input_dim=2))
model.add(Activation('sigmoid'))
model.add(Dense(1))
model.add(Activation('sigmoid'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd, class_mode="binary")

history = model.fit(X, y, nb_epoch=10000, batch_size=4, show_accuracy=True)

print
score = model.evaluate(X,y)
print
print score
print model.predict(np.array([[1, 0]]))
print model.predict(np.array([[0, 0]]))

# X vs y comparison
print
predictions = model.predict(X)
predictions = predictions.T
predictions = [1 if prediction >= 0.5 else 0 for prediction in predictions[0]]
print predictions
print [int(n) for n in y]

Unfortunately, I'm a beginner in machine learning, and I don't know why my code works and yours doesn't.
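For readers on a current Keras version: the code above uses the old Keras 1.x API (init=..., nb_epoch, show_accuracy, class_mode, Python 2 print statements). A roughly equivalent sketch in the modern tf.keras API might look like this (the hyperparameters are illustrative, not taken from the original answer):

import numpy as np
import tensorflow as tf

X = np.array([[1, 1], [0, 0], [1, 0], [0, 1]], dtype='float32')
y = np.array([[0], [0], [1], [1]], dtype='float32')

model = tf.keras.Sequential([
    tf.keras.layers.Dense(2, input_shape=(2,), activation='sigmoid'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

sgd = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd, metrics=['accuracy'])

model.fit(X, y, epochs=10000, batch_size=4, verbose=0)

print(model.predict(np.array([[1, 0]], dtype='float32')))
print(model.predict(np.array([[0, 0]], dtype='float32')))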

I used this code.
