
Simple Neural Network from scratch using NumPy

I added a learning rate and momentum to a from-scratch neural network implementation I found at: https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6

However, I have a few questions about my implementation:

  • Is it correct? Any suggested improvements? It generally appears to produce adequate results, but outside advice is very much appreciated.
  • With a learning rate < 0.5 or momentum > 0.9, the network tends to get stuck in a local optimum where the loss is ~1. I assume this is because the step size isn't big enough to escape it, but is there a way to overcome this? Or is this inherent to the nature of the data being fit and therefore unavoidable?

    import numpy as np
    import matplotlib.pyplot as plt

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(x):
        sig = 1 / (1 + np.exp(-x))
        return sig * (1 - sig)

    class NeuralNetwork:
        def __init__(self, x, y):
            self.input = x
            self.weights1 = np.random.rand(self.input.shape[1], 4)
            self.weights2 = np.random.rand(4, 1)
            self.y = y
            self.output = np.zeros(self.y.shape)
            self.v_dw1 = 0
            self.v_dw2 = 0
            self.alpha = 0.5
            self.beta = 0.5

        def feedforward(self):
            self.layer1 = sigmoid(np.dot(self.input, self.weights1))
            self.output = sigmoid(np.dot(self.layer1, self.weights2))

        def backprop(self, alpha, beta):
            # application of the chain rule to find derivative of the loss function with respect to weights2 and weights1
            d_weights2 = np.dot(self.layer1.T, (2 * (self.y - self.output) * sigmoid_derivative(self.output)))
            d_weights1 = np.dot(self.input.T, (np.dot(2 * (self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1)))
            # adding effect of momentum
            self.v_dw1 = (beta * self.v_dw1) + ((1 - beta) * d_weights1)
            self.v_dw2 = (beta * self.v_dw2) + ((1 - beta) * d_weights2)
            # update the weights with the derivative (slope) of the loss function
            self.weights1 = self.weights1 + (self.v_dw1 * alpha)
            self.weights2 = self.weights2 + (self.v_dw2 * alpha)

    if __name__ == "__main__":
        X = np.array([[0, 0, 1],
                      [0, 1, 1],
                      [1, 0, 1],
                      [1, 1, 1]])
        y = np.array([[0], [1], [1], [0]])
        nn = NeuralNetwork(X, y)

        total_loss = []
        for i in range(10000):
            nn.feedforward()
            nn.backprop(nn.alpha, nn.beta)
            total_loss.append(sum((nn.y - nn.output) ** 2))

        iteration_num = list(range(10000))
        plt.plot(iteration_num, total_loss)
        plt.show()
        print(nn.output)

First thing: in your "sigmoid_derivative(x)", the input to this function is already the output of a sigmoid, but you apply the sigmoid again and then compute the derivative. That is one problem; it should be:

return x * (1 - x)
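
Put together, a minimal sketch of the corrected helper, under the assumption (as in the posted code) that it is only ever called on values that have already been passed through the sigmoid:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(activated):
        # 'activated' is assumed to already be sigmoid(x), so
        # sigmoid(x) * (1 - sigmoid(x)) reduces to this product
        return activated * (1 - activated)

    # usage mirrors the question's backprop: the argument is self.output / self.layer1,
    # which are already sigmoid activations
    out = sigmoid(np.array([0.0, 1.0, -2.0]))
    grad = sigmoid_derivative(out)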

Second problem: you are not using any bias. How do you know your decision boundary would cross the origin in the problem's hypothesis space? You need to add a bias term; a rough sketch follows below.
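
As a rough illustration (not the original poster's code; the names bias1 and bias2 are hypothetical), one way to add bias terms to this two-layer setup is to keep a bias vector per layer and add it before the activation:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    # shapes match the question: 3 inputs, 4 hidden units, 1 output
    rng = np.random.default_rng(0)
    weights1, bias1 = rng.random((3, 4)), np.zeros((1, 4))
    weights2, bias2 = rng.random((4, 1)), np.zeros((1, 1))

    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
    layer1 = sigmoid(X @ weights1 + bias1)        # bias shifts the hidden activations
    output = sigmoid(layer1 @ weights2 + bias2)   # bias shifts the output activation

    # during backprop, the bias gradient is the layer's error term summed over samples,
    # e.g. d_bias2 = np.sum(delta2, axis=0, keepdims=True), updated like the weights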

And the last thing: I think your derivatives are not correct. You can refer to Andrew Ng's deep learning course 1, week 2, on coursera.org for a list of the general formulas for computing backpropagation in neural networks, to make sure you are doing it right.
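
For reference, here is a rough sketch (my own restatement, not the course's exact notation; the helper name backprop_step is hypothetical) of the standard chain-rule expressions for a two-layer, sigmoid-activated network with a squared-error loss. It assumes the corrected sigmoid derivative discussed above, i.e. the deltas are built directly from the already-activated values:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def backprop_step(X, y, weights1, weights2, alpha=0.5):
        # forward pass
        layer1 = sigmoid(X @ weights1)
        output = sigmoid(layer1 @ weights2)

        # error terms; with loss = sum((y - output)**2), dL/d_output = -2*(y - output),
        # so these deltas carry the minus sign of the loss gradient
        delta2 = 2 * (y - output) * output * (1 - output)        # output layer
        delta1 = (delta2 @ weights2.T) * layer1 * (1 - layer1)   # hidden layer

        # (negative) gradients with respect to the weights
        d_weights2 = layer1.T @ delta2
        d_weights1 = X.T @ delta1

        # because the minus sign was folded into the deltas, adding alpha * d_weights moves downhill
        weights1 = weights1 + alpha * d_weights1
        weights2 = weights2 + alpha * d_weights2
        return weights1, weights2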
