
Need help debugging a shallow neural network using numpy

I'm doing a hands-on exercise for learning and have created a model in Python using numpy that is trained on the breast cancer dataset from the sklearn library. The model runs without any error and gives me a train accuracy of 92.48826291079813% and a test accuracy of 90.9090909090909%. However, I'm unable to complete the hands-on, probably because my result is different from the expected one. I don't know where the problem is because I don't know the right answer, and I don't see any error either.

I would appreciate some help with this. The code is given below.

**Import numpy as np and pandas as pd**

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer

**Define method initialiseNetwork() to initialise the weights with zeros of shape (num_features, 1) and also the bias b to zero
parameters: num_features (number of input features)
returns: dictionary of weight vector and bias**


def initialiseNetwork(num_features):
  W = np.zeros((num_features,1))
  b = 0
  parameters = {"W": W, "b": b}
  return parameters

** Define function sigmoid() for the input z.
parameters: z
returns: $1/(1+e^{-z})$ **


def sigmoid(z):
  a = 1/(1 + np.exp(-z))
  return a
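
A quick sanity check I added (not part of the hands-on): sigmoid should map 0 to 0.5 and saturate towards 0 and 1 for large negative and positive inputs.

#Quick sanity check for sigmoid() (my own addition)
z_demo = np.array([-10.0, 0.0, 10.0])
print(sigmoid(z_demo))   # roughly [4.5e-05, 0.5, 0.99995]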

** Define method forwardPropagation() which implements forward propagation defined as Z = (W.T dot_product X) + b, A = sigmoid(Z)
parameters: X, parameters
returns: A **


def forwardPropagation(X, parameters):
  W = parameters["W"]
  b = parameters["b"]
  Z = np.dot(W.T,X) + b
  A = sigmoid(Z)
  return A
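
To convince yourself that the shapes line up, here is a small check (my own sketch): with X of shape (num_features, num_samples), A comes out as (1, num_samples), and since W and b are initialised to zero every activation starts at 0.5.

#Shape check for forwardPropagation() (my own sketch)
X_toy = np.random.rand(3, 5)            # 3 features, 5 samples
A_toy = forwardPropagation(X_toy, initialiseNetwork(3))
print(A_toy.shape)                      # (1, 5)
print(A_toy)                            # all 0.5 because W and b start at zero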

** Define function cost() which calculates the cost given by −sum(Y*log(A) + (1−Y)*log(1−A))/num_samples, where * is the elementwise product
parameters: A, Y, num_samples (number of samples)
returns: cost **

def cost(A, Y, num_samples):
  # binary cross-entropy averaged over the samples
  cost = -1/num_samples * np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))
  return cost
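
For intuition, a tiny worked example (my own, not from the hands-on): a confident correct prediction gives a cost near zero, while a confident wrong one makes the cost blow up.

#Worked example for the cross-entropy cost (my own addition)
A_demo = np.array([[0.9, 0.1]])   # predicted probabilities
Y_demo = np.array([[1, 0]])       # true labels
print(cost(A_demo, Y_demo, 2))    # -(log(0.9) + log(0.9))/2 ≈ 0.105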

** Define method backPropagation() to get the derivatives of the weights and bias
parameters: X, Y, A, num_samples
returns: dW, db **


def backPropagation(X, Y, A, num_samples):
  dZ = A - Y                                # derivative of the cost w.r.t. Z
  dW = np.dot(X, dZ.T)/num_samples          # (X dot_product dZ.T)/num_samples
  db = np.sum(dZ)/num_samples               # sum(dZ)/num_samples
  return dW, db
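
Since dZ = A - Y is the analytic derivative of the cross-entropy cost with respect to Z for a sigmoid output, you can verify db numerically with finite differences. A minimal check (my own sketch, on toy shapes):

#Numerical gradient check for db (my own sketch)
X_g = np.random.rand(3, 4)
Y_g = np.array([[0, 1, 1, 0]])
p_g = {"W": np.random.randn(3, 1) * 0.01, "b": 0.0}
A_g = forwardPropagation(X_g, p_g)
dW_g, db_g = backPropagation(X_g, Y_g, A_g, 4)
eps = 1e-6
c_plus = cost(forwardPropagation(X_g, {"W": p_g["W"], "b": p_g["b"] + eps}), Y_g, 4)
c_minus = cost(forwardPropagation(X_g, {"W": p_g["W"], "b": p_g["b"] - eps}), Y_g, 4)
print(db_g, (c_plus - c_minus) / (2 * eps))   # the two values should agree closely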

** Define function updateParameters() to update the current parameters with their derivatives
W = W - learning_rate * dW
b = b - learning_rate * db
parameters: parameters, dW, db, learning_rate
returns: dictionary of updated parameters **

def updateParameters(parameters, dW, db, learning_rate):
  W = parameters["W"] - (learning_rate * dW)
  b = parameters["b"] - (learning_rate * db)
  return {"W": W, "b": b}

** Define the model() function that trains the network
parameters: X, Y, num_iter (number of iterations), learning_rate
returns: parameters (dictionary of updated weights and bias) **


def model(X, Y, num_iter, learning_rate):
  num_features = X.shape[0]
  num_samples = X.shape[1]
  parameters = initialiseNetwork(num_features)                     #call initialiseNetwork()
  for i in range(num_iter):
    A = forwardPropagation(X, parameters)                          # calculate final output A from forwardPropagation()
    if(i%100 == 0):
      print("cost after {} iteration: {}".format(i, cost(A, Y, num_samples)))
    dW, db = backPropagation(X, Y, A, num_samples)                 # calculate derivatives from backpropagation
    parameters = updateParameters(parameters, dW, db, learning_rate) # update parameters
  return parameters

** Run the below cell to define the function to predict the output. It takes the updated parameters and the input data as function parameters and returns the predicted output. **

def predict(X, parameters):
  W = parameters["W"]
  b = parameters["b"]
  b = b.reshape(b.shape[0],1)
  Z = np.dot(W.T,X) + b
  Y = np.array([1 if y > 0.5 else 0 for y in sigmoid(Z[0])]).reshape(1,len(Z[0]))
  return Y

** The code in the below cell loads the breast cancer dataset from sklearn.
The input variable (X_cancer) describes the dimensions of tumor cells and the target variable (y_cancer) classifies a tumor as malignant (0) or benign (1). **


(X_cancer, y_cancer) = load_breast_cancer(return_X_y = True)

** Split the data into train and test sets using train_test_split(). Set the random state to 25. Refer to the code snippet in topic 4. **

X_train, X_test, y_train, y_test = train_test_split(X_cancer, y_cancer,
                                                   random_state = 25)

** Since the dimensions of the tumors are not uniform, you need to normalize the data before feeding it to the network.
The below function is used to normalize the input data. **

def normalize(data):
  col_max = np.max(data, axis = 0)
  col_min = np.min(data, axis = 0)
  return np.divide(data - col_min, col_max - col_min)
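
To see what normalize() does, a toy demo (my own addition): each column is rescaled to [0, 1] using its own min and max. Note that the hands-on normalizes X_train and X_test independently; in practice you would usually reuse the training min and max for the test set.

#Toy demo of min-max scaling (my own addition)
d = np.array([[1.0, 10.0],
              [2.0, 30.0],
              [3.0, 50.0]])
print(normalize(d))   # each column becomes [0, 0.5, 1]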

** Normalize X_train and X_test and assign them to X_train_n and X_test_n respectively **

X_train_n = normalize(X_train)
X_test_n = normalize(X_test)

** Transpose X_train_n and X_test_n so that rows represent features and columns represent samples.
Reshape y_train and y_test into row vectors whose length is equal to the number of samples. Use np.reshape(). **


X_trainT = X_train_n.T
#print(X_trainT.shape)
X_testT = X_test_n.T
#print(X_testT.shape)
y_trainT = y_train.reshape(1,X_trainT.shape[1])
y_testT = y_test.reshape(1,X_testT.shape[1])
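
If you want to verify the layout (my own check; the sample counts assume sklearn's default 75/25 split of the 569 samples with 30 features):

#Shape verification (my own check)
print(X_trainT.shape, y_trainT.shape)   # (30, 426) (1, 426)
print(X_testT.shape, y_testT.shape)     # (30, 143) (1, 143)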

** Train the network using X_trainT, y_trainT with 4000 iterations and learning rate 0.75 **

parameters = model(X_trainT, y_trainT, 4000, 0.75)    # call the model() function with the parameters mentioned in the above cell

** Predict the output for the train and test data using X_trainT and X_testT with the predict() method. Use the parameters returned from the trained model. **

yPredTrain = predict(X_trainT, parameters)   # pass the weights and bias from the parameters dictionary and X_trainT as input
yPredTest = predict(X_testT, parameters)    # pass the same parameters but X_testT as input data

** Run the below cell to print the accuracy of the model on the train and test data. **

accuracy_train = 100 - np.mean(np.abs(yPredTrain - y_trainT)) * 100
accuracy_test = 100 - np.mean(np.abs(yPredTest - y_testT)) * 100
print("train accuracy: {} %".format(accuracy_train))
print("test accuracy: {} %".format(accuracy_test))

My output:
train accuracy: 92.48826291079813 %
test accuracy: 90.9090909090909 %

I figured out where the problem was. It was the third line in the predict() function, where I was reshaping the bias, which was not necessary at all. The corrected function, with that line removed:

def predict(X, parameters):
  W = parameters["W"]
  b = parameters["b"]
  # removed: b = b.reshape(b.shape[0],1) -- b is already a scalar after training
  Z = np.dot(W.T,X) + b
  Y = np.array([1 if y > 0.5 else 0 for y in sigmoid(Z[0])]).reshape(1,len(Z[0]))
  return Y
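
With the reshape removed, b stays the scalar produced by np.sum() during training, so the same function now broadcasts correctly for any number of samples (my own check; counts assume the default 75/25 split):

#The corrected predict() works for any sample count (my own check)
print(predict(X_trainT, parameters).shape)   # (1, 426)
print(predict(X_testT, parameters).shape)    # (1, 143)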

The third line in the back-propagation function also needed to be corrected to np.sum(dZ)/num_samples:

def backPropagation(X, Y, A, num_samples):
  dZ = A - Y
  dW = np.dot(X, dZ.T)/num_samples
  db = np.sum(dZ)/num_samples   # corrected: np.sum() gives a scalar bias gradient
  return dW, db
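
The distinction matters because Python's builtin sum() iterates over the first axis of a 2-D array, while np.sum() reduces everything to a scalar by default. A quick demo (my own addition):

#Why np.sum matters for db (my own demo)
dZ_demo = np.array([[0.2, -0.1, 0.4]])   # shape (1, 3), like A - Y
print(sum(dZ_demo))      # [ 0.2 -0.1  0.4] -> builtin sum returns an array of shape (3,)
print(np.sum(dZ_demo))   # 0.5             -> a scalar, which is what db should be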

After I corrected both functions, the model gave me a train accuracy of 98.59154929577464% and a test accuracy of 93.00699300699301%.
