Neural Network - Basic Python

I am using the following tutorial to develop a basic neural network that does feedforward and backprop. The link to the tutorial is here: Python Neural Network Tutorial

import numpy as np

def sigmoid(x):
    return 1.0/(1+ np.exp(-x))

def sigmoid_derivative(x):
    return x * (1.0 - x)

class NeuralNetwork:
    def __init__(self, x, y):
        self.input      = x
        self.weights1   = np.random.rand(self.input.shape[1],4) 
        self.weights2   = np.random.rand(4,1)                 
        self.y          = y
        self.output     = np.zeros(self.y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # application of the chain rule to find derivative of the loss function with respect to weights2 and weights1
        d_weights2 = np.dot(self.layer1.T, (2*(self.y - self.output) * sigmoid_derivative(self.output)))
        d_weights1 = np.dot(self.input.T,  (np.dot(2*(self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1)))

        # update the weights with the derivative (slope) of the loss function
        self.weights1 += d_weights1
        self.weights2 += d_weights2


if __name__ == "__main__":
    X = np.array([[0,0,1],
                  [0,1,1],
                  [1,0,1],
                  [1,1,1]])
    y = np.array([[0],[1],[1],[0]])
    nn = NeuralNetwork(X,y)

    for i in range(1500):
        nn.feedforward()
        nn.backprop()

    print(nn.output)

What I'm trying to do is change the data set so that the network returns 1 if the numbers are even and 0 if they are odd. So I made the following changes:

if __name__ == "__main__":
    X = np.array([[2,4,6,8,10],
                  [1,3,5,7,9],
                  [11,13,15,17,19],
                  [22,24,26,28,30]])
    y = np.array([[1],[0],[0],[1]])
    nn = NeuralNetwork(X,y)

The output I get is:
[[0.50000001]
 [0.50000002]
 [0.50000001]
 [0.50000001]]

What am I doing wrong?

Basically there are two problems here:

  1. Your expression for sigmoid_derivative is wrong if x is the raw (pre-activation) input; in that case it should be:

    return sigmoid(x)*((1.0 - sigmoid(x)))

  2. If you look at a plot of the sigmoid function, or at your network weights, you will see that your network saturated because of the large input values. By doing something like X = X % 5 you can get the training result you want. This is my result on your data:

    [[9.99626174e-01] [3.55126310e-04] [3.55126310e-04] [9.99626174e-01]]

[Figure: plot of the sigmoid function]
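The saturation described in point 2 can be checked numerically. The following is a minimal sketch, not the asker's exact run: it assumes every first-layer weight equals 0.5 (the mean of the np.random.rand initialisation) so that the numbers are reproducible, and the names X_raw, grad_raw and grad_mod are made up for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(a):
    # a is assumed to already be a sigmoid activation
    return a * (1.0 - a)

X_raw = np.array([[2, 4, 6, 8, 10],
                  [1, 3, 5, 7, 9],
                  [11, 13, 15, 17, 19],
                  [22, 24, 26, 28, 30]], dtype=float)

# Assumption: fixed weights of 0.5 stand in for the random [0, 1) initialisation.
weights1 = np.full((5, 4), 0.5)

# Hidden-layer activations for the raw inputs and for X % 5.
layer1_raw = sigmoid(X_raw @ weights1)
layer1_mod = sigmoid((X_raw % 5) @ weights1)

# The gradient flowing back through the hidden layer is scaled by these factors.
grad_raw = sigmoid_derivative(layer1_raw)
grad_mod = sigmoid_derivative(layer1_mod)

print("largest derivative, raw inputs:", grad_raw.max())
print("largest derivative, X % 5     :", grad_mod.max())
```

With the raw inputs the hidden activations sit so close to 1 that the derivative factor is nearly zero, so the weight updates barely move. With the smaller inputs the factor is several orders of magnitude larger.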

Just add X = X/30 and train the network 10 times longer (15000 iterations instead of 1500). This converged for me. Dividing X by 30, the largest value in the data, scales every input into the range (0, 1]. The longer training is needed because this is a more complex dataset.
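As a rough sketch of that suggestion, the training loop from the question can be wrapped in a function and run on the scaled data. The train helper, its seed parameter and the use of np.random.default_rng are assumptions added here for reproducibility; the update rule itself is the one from the question. Whether a given run converges depends on the random initial weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(a):
    # a is assumed to already be a sigmoid activation
    return a * (1.0 - a)

def train(X, y, iterations, seed=0):
    """Hypothetical helper: same two-layer network and update rule as the question."""
    rng = np.random.default_rng(seed)
    w1 = rng.random((X.shape[1], 4))
    w2 = rng.random((4, 1))
    for _ in range(iterations):
        # feedforward
        layer1 = sigmoid(X @ w1)
        output = sigmoid(layer1 @ w2)
        # backprop: both gradients are computed before either weight matrix changes
        delta_out = 2 * (y - output) * sigmoid_derivative(output)
        d_w2 = layer1.T @ delta_out
        d_w1 = X.T @ ((delta_out @ w2.T) * sigmoid_derivative(layer1))
        w1 += d_w1
        w2 += d_w2
    # final forward pass with the trained weights
    return sigmoid(sigmoid(X @ w1) @ w2)

X = np.array([[2, 4, 6, 8, 10],
              [1, 3, 5, 7, 9],
              [11, 13, 15, 17, 19],
              [22, 24, 26, 28, 30]])
y = np.array([[1], [0], [0], [1]])

X_scaled = X / 30.0                      # every input now lies in (0, 1]
output = train(X_scaled, y, 15000)       # 10 times the original 1500 iterations
print(output)
```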

Your derivative is fine: when you call the derivative function, its input is already sigmoid(x). So x*(1-x) evaluated on that input is sigmoid(x)*(1-sigmoid(x)).
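This can be verified with a quick numerical check. The sketch below compares sigmoid_derivative applied to activations against a central finite difference of sigmoid, and also shows what happens if sigmoid is applied a second time inside the derivative. The variable names and the step size h are illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(a):
    # expects a = sigmoid(z), as in the question's backprop
    return a * (1.0 - a)

z = np.linspace(-5, 5, 11)   # pre-activation values
a = sigmoid(z)               # activations, i.e. what backprop passes in

analytic = sigmoid_derivative(a)

# central finite difference of sigmoid with respect to z
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

# applying sigmoid again to the activations gives a different quantity
double_applied = sigmoid(a) * (1.0 - sigmoid(a))

print(np.allclose(analytic, numeric))
print(np.allclose(double_applied, numeric))
```

The first comparison agrees and the second does not, so x * (1.0 - x) is the right form as long as it is only ever called on values that have already passed through sigmoid.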


 