
XOR neural network 2-1-1

I am trying to implement XOR in a neural network with a topology of 2 inputs, 1 element in the hidden layer, and 1 output. But the learning rate is really bad (0.5). I think it is because I am missing a connection between the inputs and the output, but I am not really sure how to add it. I have already added the bias connection so that learning is better. I am only using NumPy.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_output_to_derivative(output):
    return output * (1 - output)

a = 0.1  # learning rate (step size)

X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])
np.random.seed(1)

y = np.array([[0],
              [1],
              [1],
              [0]])

# append a bias column of ones to the inputs
bias = np.ones(4)
X = np.c_[bias, X]

# weights: 3 inputs (bias + 2) -> 1 hidden unit, 1 hidden unit -> 1 output
synapse_0 = 2 * np.random.random((3, 1)) - 1
synapse_1 = 2 * np.random.random((1, 1)) - 1

for j in range(600000):

    layer_0 = X
    layer_1 = sigmoid(np.dot(layer_0, synapse_0))  # hidden layer
    layer_2 = sigmoid(np.dot(layer_1, synapse_1))  # output layer
    layer_2_error = layer_2 - y

    if (j % 10000) == 0:
        print("Error after " + str(j) + " iterations: " + str(np.mean(np.abs(layer_2_error))))

    # backpropagate: deltas for the output and hidden layer
    layer_2_delta = layer_2_error * sigmoid_output_to_derivative(layer_2)
    layer_1_error = layer_2_delta.dot(synapse_1.T)
    layer_1_delta = layer_1_error * sigmoid_output_to_derivative(layer_1)

    # gradient descent weight updates
    synapse_1 -= a * (layer_1.T.dot(layer_2_delta))
    synapse_0 -= a * (layer_0.T.dot(layer_1_delta))

You need to be careful with statements like

the learning rate is bad

Usually, the learning rate is the step size that gradient descent takes in the negative gradient direction, so I'm not sure what you mean by a bad learning rate.

I'm also not sure if I understand your code correctly, but the forward step of a neural net is basically a matrix multiplication of the hidden layer's weight matrix times the input vector. This will (if you set everything up correctly) result in a matrix whose size equals the size of your hidden layer. Now you can simply add the bias before applying your logistic function elementwise to this matrix:

h_i = f(h_i+bias_in)

Afterwards you can do the same thing for the hidden layer times the output weights and apply its activation to get the outputs:

o_j = f(o_j+bias_h)
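Here is a minimal sketch of that forward pass. The names (f, W_in, b_in, W_out, b_h) and the 2-2-1 shapes are placeholders chosen for illustration, not taken from the code above:

import numpy as np

def f(x):
    # logistic (sigmoid) activation, applied elementwise
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(1)
W_in = 2 * rng.random((2, 2)) - 1    # input -> hidden weights (2 inputs, 2 hidden units)
b_in = np.zeros(2)                   # hidden-layer bias
W_out = 2 * rng.random((2, 1)) - 1   # hidden -> output weights
b_h = np.zeros(1)                    # output-layer bias

x = np.array([[0, 1]])               # one input sample

h = f(x.dot(W_in) + b_in)            # hidden activations: h_i = f(h_i + bias_in)
o = f(h.dot(W_out) + b_h)            # output activation:  o_j = f(o_j + bias_h)

With a whole batch of inputs (as in the question, where X has one row per sample), the same two lines work unchanged because the bias vectors broadcast over the rows.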

The backwards step is to calculate the deltas at the output and hidden layer, including another elementwise operation with your function

sigmoid_output_to_derivative(output)

and update both weight matrices using the gradients (here the learning rate is needed to define the step size). The gradients are simply the value of the corresponding node times its delta. Note: the deltas are calculated differently for output and hidden nodes.
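Continuing the sketch above with the same assumed names (target and lr are also placeholders; sigmoid_output_to_derivative is the helper from the question):

lr = 0.1                             # learning rate: gradient descent step size
target = np.array([[1]])             # desired XOR output for x = [0, 1]

def sigmoid_output_to_derivative(output):
    return output * (1 - output)

# delta at the output layer: error times activation derivative
o_delta = (o - target) * sigmoid_output_to_derivative(o)

# delta at the hidden layer: backpropagated error times activation derivative
h_delta = o_delta.dot(W_out.T) * sigmoid_output_to_derivative(h)

# gradient = value of the corresponding node times its delta
W_out -= lr * h.T.dot(o_delta)
W_in -= lr * x.T.dot(h_delta)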

I'd advise you to keep separate variables for the biases, because modern approaches usually update them by summing up the deltas of their connected nodes, multiplying that sum by a (possibly different) learning rate, and subtracting this product from the specific bias.
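A sketch of such a bias update, continuing the variables above (lr_bias is an assumed separate learning rate, not something from the original code):

lr_bias = 0.05                       # assumed separate learning rate for the biases

# sum the deltas of the connected nodes (over the batch axis) and step the bias
b_h -= lr_bias * o_delta.sum(axis=0)
b_in -= lr_bias * h_delta.sum(axis=0)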

Take a look at the following tutorial (it uses numpy):

http://peterroelants.github.io/posts/neural_network_implementation_part04/
