Problems implementing an XOR gate with Neural Nets in TensorFlow

I want to build a trivial neural network that implements just the XOR gate. I am using the TensorFlow library in Python. The only data I train with is the complete truth table of the gate, which should be enough, right? I expect the network to overfit very quickly. The problem with the code is that the weights and biases do not update. Somehow it still reports 100% accuracy with all weights and biases at zero.

x = tf.placeholder("float", [None, 2])
W = tf.Variable(tf.zeros([2,2]))
b = tf.Variable(tf.zeros([2]))

y = tf.nn.softmax(tf.matmul(x,W) + b)

y_ = tf.placeholder("float", [None,1])


print "Done init"

cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.75).minimize(cross_entropy)

print "Done loading vars"

init = tf.initialize_all_variables()
print "Done: Initializing variables"

sess = tf.Session()
sess.run(init)
print "Done: Session started"

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1], [0], [0], [0]])


acc=0.0
while acc<0.85:
  for i in range(500):
      sess.run(train_step, feed_dict={x: xTrain, y_: yTrain})


  print b.eval(sess)
  print W.eval(sess)


  print "Done training"


  correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

  accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

  print "Result:"
  acc= sess.run(accuracy, feed_dict={x: xTrain, y_: yTrain})
  print acc

B0 = b.eval(sess)[0]
B1 = b.eval(sess)[1]
W00 = W.eval(sess)[0][0]
W01 = W.eval(sess)[0][1]
W10 = W.eval(sess)[1][0]
W11 = W.eval(sess)[1][1]

for A,B in product([0,1],[0,1]):
  top = W00*A + W01*A + B0
  bottom = W10*B + W11*B + B1
  print "A:",A," B:",B
  # print "Top",top," Bottom: ", bottom
  print "Sum:",top+bottom

I am following the tutorial at http://tensorflow.org/tutorials/mnist/beginners/index.md#softmax_regressions, and in the final for-loop I print the results from the matrix (as described in the link).

Can anybody point out my error and what I should do to fix it?

There are a few issues with your program.

The first issue is that the function you're learning isn't XOR; it's NOR. The lines:

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1], [0], [0], [0]])

...should be:

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[0], [1], [1], [0]])
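
As a quick sanity check, the two label columns can be derived from the inputs rather than typed by hand. This is a minimal NumPy sketch, not part of the original program; the names `x_train`, `xor_labels`, and `nor_labels` are introduced here for illustration only.

```python
import numpy as np

# The four possible input pairs, in the same order as xTrain above.
x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
a, b = x_train[:, 0], x_train[:, 1]

# XOR is 1 when exactly one of the two inputs is 1.
xor_labels = np.logical_xor(a, b).astype(int).reshape(-1, 1)

# NOR is 1 only when both inputs are 0.
nor_labels = np.logical_not(np.logical_or(a, b)).astype(int).reshape(-1, 1)

print("XOR:", xor_labels.ravel().tolist())  # [0, 1, 1, 0]
print("NOR:", nor_labels.ravel().tolist())  # [1, 0, 0, 0]
```

The NOR column matches the labels in the question, and the XOR column matches the corrected labels.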

The next big issue is that the network you've designed isn't capable of learning XOR. You'll need to use a non-linear function (such as tf.nn.relu()) and define at least one more layer to learn the XOR function. For example:

x = tf.placeholder("float", [None, 2])
W_hidden = tf.Variable(...)
b_hidden = tf.Variable(...)
hidden = tf.nn.relu(tf.matmul(x, W_hidden) + b_hidden)

W_logits = tf.Variable(...)
b_logits = tf.Variable(...)
logits = tf.matmul(hidden, W_logits) + b_logits
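
To see why the extra layer matters, here is a minimal NumPy sketch that is not part of the original program. It makes two assumptions: a least-squares linear fit is used as a stand-in for a model with no hidden layer, and the hidden-layer weights are hand-picked for illustration rather than learned.

```python
import numpy as np

x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([0, 1, 1, 0], dtype=float)

# 1) Best linear fit: inputs plus a constant column that acts as the bias.
design = np.hstack([x_train, np.ones((4, 1))])
coeffs, _, _, _ = np.linalg.lstsq(design, y_xor, rcond=None)
linear_pred = np.dot(design, coeffs)
print("Linear fit:", np.round(linear_pred, 3).tolist())

# 2) One ReLU hidden layer with two units and hand-picked weights.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([0.0, -1.0])
w_out = np.array([1.0, -2.0])

hidden = np.maximum(0.0, np.dot(x_train, W_hidden) + b_hidden)
relu_pred = np.dot(hidden, w_out)
print("ReLU network:", relu_pred.tolist())
```

The linear fit predicts 0.5 for every input, so it carries no information about the label, while the two-unit ReLU network reproduces the XOR column exactly.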

A further issue is that initializing the weights to zero will prevent your network from training. Typically, you should initialize your weights randomly and your biases to zero. Here's one popular way to do it:

HIDDEN_NODES = 2

W_hidden = tf.Variable(tf.truncated_normal([2, HIDDEN_NODES], stddev=1./math.sqrt(2)))
b_hidden = tf.Variable(tf.zeros([HIDDEN_NODES]))

W_logits = tf.Variable(tf.truncated_normal([HIDDEN_NODES, 2], stddev=1./math.sqrt(HIDDEN_NODES)))
b_logits = tf.Variable(tf.zeros([2]))
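
The effect of all-zero initialization can be checked with a small NumPy sketch of the same two-layer architecture. The `gradients` helper is hypothetical, written by hand for this illustration, and plain normal noise stands in for `tf.truncated_normal`.

```python
import numpy as np

def gradients(W_hidden, b_hidden, W_logits, b_logits, x, labels):
    """Hand-written backprop for a ReLU hidden layer followed by
    softmax cross-entropy, averaged over the batch."""
    pre = np.dot(x, W_hidden) + b_hidden
    hidden = np.maximum(0.0, pre)
    logits = np.dot(hidden, W_logits) + b_logits
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

    d_logits = (probs - labels) / len(x)
    d_W_logits = np.dot(hidden.T, d_logits)
    d_b_logits = d_logits.sum(axis=0)
    d_pre = np.dot(d_logits, W_logits.T) * (pre > 0)  # ReLU passes gradient only where pre > 0
    d_W_hidden = np.dot(x.T, d_pre)
    d_b_hidden = d_pre.sum(axis=0)
    return d_W_hidden, d_b_hidden, d_W_logits, d_b_logits

x_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_train = np.array([[1, 0], [0, 1], [0, 1], [1, 0]], dtype=float)
hidden_nodes = 10

# All-zero initialization: every gradient is zero, so nothing can update.
zero_grads = gradients(np.zeros((2, hidden_nodes)), np.zeros(hidden_nodes),
                       np.zeros((hidden_nodes, 2)), np.zeros(2),
                       x_train, y_train)
print("max |grad| with zero init:", max(np.abs(g).max() for g in zero_grads))

# Random weights, zero biases: gradients are non-zero and training can start.
rng = np.random.RandomState(0)
random_grads = gradients(rng.normal(scale=1.0 / np.sqrt(2), size=(2, hidden_nodes)),
                         np.zeros(hidden_nodes),
                         rng.normal(scale=1.0 / np.sqrt(hidden_nodes), size=(hidden_nodes, 2)),
                         np.zeros(2),
                         x_train, y_train)
print("max |grad| with random init:", max(np.abs(g).max() for g in random_grads))
```

With zeros everywhere, the hidden activations are zero and the softmax output is uniform, so on the balanced XOR labels every gradient cancels and gradient descent never moves.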

Putting it all together, and using TensorFlow routines for cross-entropy (with a one-hot encoding of yTrain for convenience), here's a program that learns XOR:

import math
import tensorflow as tf
import numpy as np

HIDDEN_NODES = 10

x = tf.placeholder(tf.float32, [None, 2])
W_hidden = tf.Variable(tf.truncated_normal([2, HIDDEN_NODES], stddev=1./math.sqrt(2)))
b_hidden = tf.Variable(tf.zeros([HIDDEN_NODES]))
hidden = tf.nn.relu(tf.matmul(x, W_hidden) + b_hidden)

W_logits = tf.Variable(tf.truncated_normal([HIDDEN_NODES, 2], stddev=1./math.sqrt(HIDDEN_NODES)))
b_logits = tf.Variable(tf.zeros([2]))
logits = tf.matmul(hidden, W_logits) + b_logits

y = tf.nn.softmax(logits)

y_input = tf.placeholder(tf.float32, [None, 2])

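# Pass labels and logits by name; TensorFlow 1.0 and later require keyword arguments here.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_input, logits=logits)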
loss = tf.reduce_mean(cross_entropy)

train_op = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

init_op = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init_op)

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])

for i in xrange(500):
  _, loss_val = sess.run([train_op, loss], feed_dict={x: xTrain, y_input: yTrain})

  if i % 10 == 0:
    print "Step:", i, "Current loss:", loss_val
    for x_input in [[0, 0], [0, 1], [1, 0], [1, 1]]:
      print x_input, sess.run(y, feed_dict={x: [x_input]})

Note that this is probably not the most efficient neural network for computing XOR, so suggestions for tweaking the parameters are welcome!
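
One more detail from the question, the 100% accuracy reported with all-zero weights, can be reproduced without TensorFlow. The sketch below uses NumPy's `argmax`, which returns the first index on ties; that TensorFlow's `tf.argmax` broke the tie the same way in the original run is an assumption based on the behavior the question describes. The variable names are introduced here for illustration.

```python
import numpy as np

# Single-column labels, as in the question.
y_single = np.array([[1], [0], [0], [0]])

# With W and b all zero, softmax assigns 0.5 to both classes for every row.
y_pred = np.full((4, 2), 0.5)

# argmax over a single column is always 0, and the tied predictions also
# resolve to index 0 in NumPy, so every row "matches".
pred_class = np.argmax(y_pred, axis=1)
accuracy_single = np.mean(pred_class == np.argmax(y_single, axis=1))
print("Accuracy with single-column labels:", accuracy_single)

# One-hot XOR labels, matching yTrain in the program above.
y_onehot = np.eye(2, dtype=int)[np.array([0, 1, 1, 0])]
accuracy_onehot = np.mean(pred_class == np.argmax(y_onehot, axis=1))
print("Accuracy with one-hot labels:", accuracy_onehot)
```

With one-hot labels the same untrained predictions score 0.5, which is why the two-column encoding makes the accuracy check meaningful.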
