TensorFlow neural network is always 50% sure after training
I have just followed a tutorial on neural networks and tried to put my knowledge to the test. I made a simple network that learns XOR, but for some reason it always returns 0.5 (50% sure). Here is my code:
import tensorflow as tf
import numpy as np
def random_normal(shape=1):
    return (np.random.random(shape) - 0.5) * 2
train_x = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
train_y = np.array([1, 1, 0, 0])
input_size = 2
hidden_size = 16
output_size = 1
x = tf.placeholder(dtype=tf.float32, name="X")
y = tf.placeholder(dtype=tf.float32, name="Y")
W1 = tf.Variable(random_normal((input_size, hidden_size)), dtype=tf.float32, name="W1")
W2 = tf.Variable(random_normal((hidden_size, output_size)), dtype=tf.float32, name="W2")
b1 = tf.Variable(random_normal(hidden_size), dtype=tf.float32, name="b1")
b2 = tf.Variable(random_normal(output_size), dtype=tf.float32, name="b2")
l1 = tf.sigmoid(tf.add(tf.matmul(x, W1), b1), name="l1")
result = tf.sigmoid(tf.add(tf.matmul(l1, W2), b2), name="l2")
r_squared = tf.square(result - y)
loss = tf.reduce_mean(r_squared)
optimizer = tf.train.GradientDescentOptimizer(0.1)
train = optimizer.minimize(loss)
hm_epochs = 10000
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for itr in range(hm_epochs):
        sess.run(train, {x: train_x, y: train_y})
        if itr % 100 == 0:
            print("Epoch {} done".format(itr))
    print(sess.run(result, {x: [[1, 0]]}))
Sorry if this is a bad question; I am new to machine learning.
Your neural network is actually correct, and the answer may surprise you. Change...
train_x = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
train_y = np.array([1, 1, 0, 0])
to...
train_x = np.array([[1, 0], [0, 1], [1, 1], [0, 0]]).reshape((4, 2))
train_y = np.array([1, 1, 0, 0]).reshape((4, 1))
You can check that np.array([1, 1, 0, 0]).shape is (4,), not (4, 1). As a result, the shape of y becomes (4,) as well. Since result has shape (4, 1), broadcasting gives result - y the shape (4, 4)! In other words, the loss computes 16 differences that have nothing to do with the actual comparison of the prediction and the label. Each prediction is compared against all four labels, so the loss is minimized by outputting their mean, 0.5, for every input. So my suggestion for the future: always specify the shape of the placeholders explicitly so that bugs like this are easier to find.
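The shape problem described above can be reproduced without TensorFlow, since NumPy follows the same broadcasting rules. The following is a minimal sketch: the values in result are made up purely for illustration, and broadcast_loss is a hypothetical helper that mimics the mean squared error from the question.

```python
import numpy as np

# Stand-in for the network output on 4 samples: shape (4, 1).
# The values are invented for illustration; only the shapes matter.
result = np.array([[0.9], [0.8], [0.1], [0.2]])

y_flat = np.array([1, 1, 0, 0])    # shape (4,), as in the question
y_col = y_flat.reshape((4, 1))     # shape (4, 1), as in the fix

bad_diff = result - y_flat         # broadcasts to shape (4, 4)
good_diff = result - y_col         # stays at shape (4, 1)

print(bad_diff.shape)
print(good_diff.shape)


def broadcast_loss(pred):
    """Mean squared error computed against the flat (4,) labels."""
    return np.mean((pred - y_flat) ** 2)


# Every prediction is compared against every label, so a constant
# prediction of 0.5 (the label mean) scores better than the true XOR output.
const_half = np.full((4, 1), 0.5)
perfect = y_col.astype(float)

print(broadcast_loss(const_half))  # loss when always answering 0.5
print(broadcast_loss(perfect))     # loss for the correct XOR outputs
```

With the flat labels, always answering 0.5 gives a lower loss than the correct XOR outputs, which is consistent with the network settling at 0.5.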
You can find the complete code in this GitHub gist I created. One more remark: the last sigmoid actually makes it harder to learn the [0, 1] output. If you remove it, the network converges much faster.
import tensorflow as tf
import keras
import numpy as np
seed = 128
train_x = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
train_y = np.array([1, 1, 0, 0])
test_x = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
test_y = np.array([1, 1, 0, 0])
num_classes = 2
y_train_binary = keras.utils.to_categorical(train_y, num_classes)
y_test_binary = keras.utils.to_categorical(test_y, num_classes)
def random_normal(shape=1):
    return (np.random.random(shape) - 0.5) * 2
n_hidden_1 = 16
n_input = train_x.shape[1]
n_classes = y_train_binary.shape[1]
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_hidden_1, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
keep_prob = tf.placeholder("float")
training_epochs = 500
display_step = 100
batch_size = 1
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with ReLU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Output layer returns raw logits; softmax is applied inside the loss
    out_layer = tf.matmul(layer_1, weights['out']) + biases['out']
    return out_layer
predictions = multilayer_perceptron(x, weights, biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=predictions, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=0.1).minimize(cost)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for epoch in range(training_epochs):
    avg_cost = 0.0
    total_batch = int(len(train_x) / batch_size)
    x_batches = np.array_split(train_x, total_batch)
    y_batches = np.array_split(y_train_binary, total_batch)
    for i in range(total_batch):
        batch_x, batch_y = x_batches[i], y_batches[i]
        _, c = sess.run([optimizer, cost],
                        feed_dict={x: batch_x, y: batch_y})
        avg_cost += c / total_batch
    if epoch % display_step == 0:
        print("Epoch:", '%04d' % (epoch+1), "cost={:.9f}".format(avg_cost))
print("Optimization Finished!")
correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print("Accuracy:", accuracy.eval({x: test_x, y: y_test_binary}, session=sess))
Epoch: 0001 cost=3.069790050
Epoch: 0101 cost=0.001279908
Epoch: 0201 cost=0.000363608
Epoch: 0301 cost=0.000168160
Epoch: 0401 cost=0.000095065
Optimization Finished!
Accuracy: 1.0
test_input = [0, 1]
'Label: ', np.argmax(sess.run(predictions , feed_dict={ x:[test_input]}))
('Label: ', 1)
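The code above relies on keras.utils.to_categorical to turn the 0/1 labels into one-hot rows, and on argmax to turn the two output values back into a class label. As a rough sketch of that round trip in plain NumPy (one_hot is a hypothetical helper, assumed to behave like to_categorical for integer labels, not the Keras implementation itself):

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return one row per label, with a 1 in the column of its class."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes)[np.asarray(labels, dtype=int)]


train_y = np.array([1, 1, 0, 0])
encoded = one_hot(train_y, 2)
print(encoded)

# argmax along axis 1 recovers the original class indices.
decoded = np.argmax(encoded, axis=1)
print(decoded)
```

This is why the prediction step applies np.argmax to the network output: the index of the larger of the two values is the predicted XOR result.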
For such a simple case you can use Keras to quickly test whether the dataset is well suited to a neural network. However, you will need to simulate more data for the network to be sufficiently tuned. I do not think that the gradient descent algorithm is capable of finding an optimal point using backpropagation on only 4 instances.
Let's simulate more data:
n = 1000
X_train = np.zeros((n, 2))
y_train = np.zeros((n,))
X_test = np.zeros((n//3, 2))
y_test = np.zeros((n//3,))
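# Fill the training set with random bit pairs labelled by XOR
for i in range(n):
    a, b = np.random.randint(0,2), np.random.randint(0,2)
    X_train[i, 0], X_train[i, 1] = a, b
    y_train[i] = (a and not b) or (not a and b)
# Fill the test set in its own loop so the index stays within n//3 rows
# (the original `if n%3 == 0` check tested n, not i, and never fired for n = 1000)
for i in range(n//3):
    a, b = np.random.randint(0,2), np.random.randint(0,2)
    X_test[i, 0], X_test[i, 1] = a, b
    y_test[i] = (a and not b) or (not a and b)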
num_classes = 2
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)
input_shape = (2,)
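The data-generation loop above can also be written in vectorized form. This is only a sketch of an alternative, not the answer's original code: make_xor_data is a hypothetical helper name, and the fixed seed is an assumption added for reproducibility.

```python
import numpy as np


def make_xor_data(n, rng):
    """Draw n random bit pairs and label each pair with a XOR b."""
    X = rng.integers(0, 2, size=(n, 2))   # upper bound is exclusive: 0 or 1
    y = np.logical_xor(X[:, 0], X[:, 1]).astype(int)
    return X.astype(float), y


rng = np.random.default_rng(0)            # seed chosen arbitrarily
X_train, y_train = make_xor_data(1000, rng)
X_test, y_test = make_xor_data(1000 // 3, rng)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

Generating the test set with a separate call avoids any index bookkeeping between the two arrays.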
Now let's build our model:
# Imports needed by the model code (missing from the original snippet)
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(16, activation='relu', input_shape=input_shape))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])
history = model.fit(X_train,
                    y_train_binary,
                    epochs=10,
                    batch_size=8,
                    validation_data=(X_test, y_test_binary))
This will result in 100% accuracy.