
Simple Neural Network from scratch using NumPy

I added a learning rate and momentum to a from-scratch neural network implementation I found at: https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6

However, I have a few questions about my implementation:

  • Is it correct? Any suggested improvements? It generally appears to produce adequate results, but outside advice is very much appreciated.
  • With a learning rate < 0.5 or momentum > 0.9, the network tends to get stuck in a local optimum where the loss is ~1. I assume this is because the step size isn't big enough to escape it, but is there a way to overcome this? Or is this inherent to the nature of the data being fit and therefore unavoidable?

    import numpy as np
    import matplotlib.pyplot as plt

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(x):
        sig = 1 / (1 + np.exp(-x))
        return sig * (1 - sig)

    class NeuralNetwork:
        def __init__(self, x, y):
            self.input = x
            self.weights1 = np.random.rand(self.input.shape[1], 4)
            self.weights2 = np.random.rand(4, 1)
            self.y = y
            self.output = np.zeros(self.y.shape)
            self.v_dw1 = 0
            self.v_dw2 = 0
            self.alpha = 0.5
            self.beta = 0.5

        def feedforward(self):
            self.layer1 = sigmoid(np.dot(self.input, self.weights1))
            self.output = sigmoid(np.dot(self.layer1, self.weights2))

        def backprop(self, alpha, beta):
            # application of the chain rule to find derivative of the loss function with respect to weights2 and weights1
            d_weights2 = np.dot(self.layer1.T, (2 * (self.y - self.output) * sigmoid_derivative(self.output)))
            d_weights1 = np.dot(self.input.T, (np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1)))
            # adding effect of momentum
            self.v_dw1 = (beta * self.v_dw1) + ((1 - beta) * d_weights1)
            self.v_dw2 = (beta * self.v_dw2) + ((1 - beta) * d_weights2)
            # update the weights with the derivative (slope) of the loss function
            self.weights1 = self.weights1 + (self.v_dw1 * alpha)
            self.weights2 = self.weights2 + (self.v_dw2 * alpha)

    if __name__ == "__main__":
        X = np.array([[0, 0, 1],
                      [0, 1, 1],
                      [1, 0, 1],
                      [1, 1, 1]])
        y = np.array([[0], [1], [1], [0]])
        nn = NeuralNetwork(X, y)

        total_loss = []
        for i in range(10000):
            nn.feedforward()
            nn.backprop(nn.alpha, nn.beta)
            total_loss.append(sum((nn.y - nn.output) ** 2))

        iteration_num = list(range(10000))
        plt.plot(iteration_num, total_loss)
        plt.show()
        print(nn.output)

First thing: in your "sigmoid_derivative(x)", the input to this function is already the output of a sigmoid, but you apply the sigmoid again and then compute the derivative. That is one problem; it should be:

return x * (1 - x)
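
Put together, a minimal sketch of the corrected helper, under the assumption (as in the posted code) that it is only ever called on values that have already been passed through the sigmoid:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(activated):
        # 'activated' is assumed to already be sigmoid(x), so
        # sigmoid(x) * (1 - sigmoid(x)) reduces to this product
        return activated * (1 - activated)

    # usage mirrors the question's backprop: the argument is self.output / self.layer1,
    # which are already sigmoid activations
    out = sigmoid(np.array([0.0, 1.0, -2.0]))
    grad = sigmoid_derivative(out)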

Second problem: you are not using any bias. How do you know your decision boundary would cross the origin in the problem's hypothesis space? You need to add a bias term; a rough sketch follows below.
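
As a rough illustration (not the original poster's code; the names bias1 and bias2 are hypothetical), one way to add bias terms to this two-layer setup is to keep a bias vector per layer and add it before the activation:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    # shapes match the question: 3 inputs, 4 hidden units, 1 output
    rng = np.random.default_rng(0)
    weights1, bias1 = rng.random((3, 4)), np.zeros((1, 4))
    weights2, bias2 = rng.random((4, 1)), np.zeros((1, 1))

    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
    layer1 = sigmoid(X @ weights1 + bias1)        # bias shifts the hidden activations
    output = sigmoid(layer1 @ weights2 + bias2)   # bias shifts the output activation

    # during backprop, the bias gradient is the layer's error term summed over samples,
    # e.g. d_bias2 = np.sum(delta2, axis=0, keepdims=True), updated like the weights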

And the last thing: I think your derivatives are not correct. You can refer to Andrew Ng's deep learning course 1, week 2, on coursera.org for a list of the general formulas for computing backpropagation in neural networks, to make sure you are doing it right.
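
For reference, here is a rough sketch (my own restatement, not the course's exact notation; the helper name backprop_step is hypothetical) of the standard chain-rule expressions for a two-layer, sigmoid-activated network with a squared-error loss. It assumes the corrected sigmoid derivative discussed above, i.e. the deltas are built directly from the already-activated values:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def backprop_step(X, y, weights1, weights2, alpha=0.5):
        # forward pass
        layer1 = sigmoid(X @ weights1)
        output = sigmoid(layer1 @ weights2)

        # error terms; with loss = sum((y - output)**2), dL/d_output = -2*(y - output),
        # so these deltas carry the minus sign of the loss gradient
        delta2 = 2 * (y - output) * output * (1 - output)        # output layer
        delta1 = (delta2 @ weights2.T) * layer1 * (1 - layer1)   # hidden layer

        # (negative) gradients with respect to the weights
        d_weights2 = layer1.T @ delta2
        d_weights1 = X.T @ delta1

        # because the minus sign was folded into the deltas, adding alpha * d_weights moves downhill
        weights1 = weights1 + alpha * d_weights1
        weights2 = weights2 + alpha * d_weights2
        return weights1, weights2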
