
My neural network only predicts one thing

I am new to deep neural networks and am trying to implement one in Python from scratch. I have tried a lot but couldn't find the bug in my implementation. Whenever I use my predict function it always outputs 0. I have also tested each and every function that you'll see in my code below, using random arrays of the same shape as x and y (you'll see below), and all of them seem to work perfectly. I have also previously cleaned the data.

import os
os.chdir(r'path where my data is stored')  # change the working directory to where the data set is stored

Creating a dataframe and assigning values to the input and target vectors

import pandas as pd
import numpy as np
df = pd.read_csv('clean_data.csv')
X = df[['radius_mean', 'texture_mean', 'perimeter_mean',
   'area_mean', 'smoothness_mean', 'compactness_mean', 'concavity_mean',
   'concave points_mean', 'symmetry_mean', 'fractal_dimension_mean',
   'radius_se', 'texture_se', 'perimeter_se', 'area_se', 'smoothness_se',
   'compactness_se', 'concavity_se', 'concave points_se', 'symmetry_se',
   'fractal_dimension_se', 'radius_worst', 'texture_worst',
   'perimeter_worst', 'area_worst', 'smoothness_worst',
   'compactness_worst', 'concavity_worst', 'concave points_worst',
   'symmetry_worst', 'fractal_dimension_worst']].values
Y = df['diagnosis'].values 
Y = Y.reshape(569,1)

Splitting the data into training and test sets (x and y are the training set; xt and yt are the test set)

from sklearn.model_selection import train_test_split
x, xt, y, yt = train_test_split(X, Y, test_size = 0.2, random_state = 40)
x, xt, y, yt = x.T, xt.T, y.T, yt.T
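
A quick shape sanity check (my addition, not in the original post): with this column-per-sample layout, x should come out as (30, 455) and y as (1, 455) for the 569-row dataset, with xt and yt at (30, 114) and (1, 114).

# Sanity check: features are rows, samples are columns after the transpose
print(x.shape, y.shape, xt.shape, yt.shape)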

initializing parameters

def iniparams(layer_dims):
    params = {}
    for l in range(1, len(layer_dims)):
        params['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1])*0.01
        params['b' + str(l)] = np.zeros((layer_dims[l], 1))
    return params
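
As an illustrative example (not part of the original post), calling iniparams([30, 8, 1]) should give W1 of shape (8, 30), b1 of (8, 1), W2 of (1, 8), and b2 of (1, 1):

# Illustrative check of the initialized parameter shapes
params = iniparams([30, 8, 1])
for name, value in params.items():
    print(name, value.shape)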

Writing helper functions #1

def sigmoid(Z):
    return 1/(1 + np.exp(-Z)), Z

#2

def relu(Z):
    return np.maximum(0, Z), Z

Linear forward

def linearfwd(W, A, b):
    Z = np.dot(W, A) + b
    linear_cache = (W, A, b)
    return Z, linear_cache

Forward activation

def fwdactivation(W, A_prev, b, activation):
    if activation == 'sigmoid':
        Z, linear_cache = linearfwd(W, A_prev, b)
        A, activation_cache = sigmoid(Z)
    elif activation == 'relu':
        Z, linear_cache = linearfwd(W, A_prev, b)
        A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache

Forward model

def fwdmodel(x, params):
    caches = []
    L = len(params)//2
    A = x
    for l in range(1, L):
        A_prev = A
        A, cache = fwdactivation(params['W' + str(l)], A_prev, params['b' + str(l)], 'relu')
        caches.append(cache)
    AL, cache = fwdactivation(params['W' + str(L)], A, params['b' + str(L)], 'sigmoid')
    caches.append(cache)
    return AL, caches

Computing cost

def J(AL, y):
    return -np.sum(np.multiply(np.log(AL), y) + np.multiply(np.log(1 - AL), (1 - y)))/y.shape[1]
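
One caveat worth noting (my addition, not in the original code): if AL ever saturates to exactly 0 or 1, np.log returns -inf and the cost becomes nan. A common guard is to clip the activations first, as in this sketch:

def J_stable(AL, y, eps=1e-8):
    # Clip activations away from exactly 0 and 1 before taking logs
    AL = np.clip(AL, eps, 1 - eps)
    return -np.sum(y*np.log(AL) + (1 - y)*np.log(1 - AL))/y.shape[1]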

backward sigmoid

def sigmoidbkwd(dA, cache):
    Z = cache
    s = 1/(1 + np.exp(-Z))
    dZ = dA*s*(1 - s)
    return dZ

backward relu

def relubkwd(dA, cache):
    Z = cache
    # ReLU gradient: pass dA through where Z > 0, zero it out elsewhere
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ

linear bkwd

def linearbkwd(dZ, cache):
    W, A_prev, b = cache
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T)/m
    db = np.sum(dZ, axis = 1, keepdims = True)/m
    dA_prev = np.dot(W.T, dZ)
    return dW, dA_prev, db

backward activation

def bkwdactivation(dA, cache, activation):
    linear_cache, activation_cache = cache
    if activation == 'sigmoid':
        dZ = sigmoidbkwd(dA, activation_cache)
        dW, dA_prev, db = linearbkwd(dZ, linear_cache)
    elif activation == 'relu':
        dZ = relubkwd(dA, activation_cache)
        dW, dA_prev, db = linearbkwd(dZ, linear_cache)
    return dW, dA_prev, db

backward model

def bkwdmodel(AL, y, cache):
    grads = {}
    L = len(cache)
    dAL = -(np.divide(y, AL) - np.divide(1 - y, 1 - AL))
    current_cache = cache[L - 1]
    grads['dW' + str(L)], grads['dA' + str(L - 1)], grads['db' + str(L)] = bkwdactivation(dAL, current_cache, 'sigmoid')
    for l in reversed(range(L - 1)):
        current_cache = cache[l]
        dW_temp, dA_prev_temp, db_temp = bkwdactivation(grads['dA' + str(l + 1)], current_cache, 'relu')
        grads['dW' + str(l + 1)] = dW_temp
        grads['dA' + str(l)] = dA_prev_temp
        grads['db' + str(l + 1)] = db_temp
    return grads
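
A useful invariant to verify here (an illustrative sanity check, not from the original post) is that every gradient has exactly the same shape as the parameter it updates; otherwise the update step would broadcast silently:

# Run once after a forward/backward pass to confirm gradient shapes
AL, caches = fwdmodel(x, params)
grads = bkwdmodel(AL, y, caches)
for l in range(1, len(params)//2 + 1):
    assert grads['dW' + str(l)].shape == params['W' + str(l)].shape
    assert grads['db' + str(l)].shape == params['b' + str(l)].shape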

Optimizing parameters using gradient descent

def optimize(grads, params, alpha):
    L = len(params)//2
    for l in range(1, L + 1):
        params['W' + str(l)] = params['W' + str(l)] - alpha*grads['dW' + str(l)]
        params['b' + str(l)] = params['b' + str(l)] - alpha*grads['db' + str(l)]
    return params

Neural Network Model

def model(x, y, layer_dims, iters):
    costs = []
    params = iniparams(layer_dims)
    for i in range(1, iters):
        AL, caches = fwdmodel(x, params)
        cost = J(AL, y)
        costs.append(cost)  # record the cost once per iteration
        grads = bkwdmodel(AL, y, caches)
        params = optimize(grads, params, 1.2)
        if i % 100 == 0:
            print('Cost after', i, 'iterations is:', cost)
    return costs, params

Calculation (the cost does get reduced; see the Cost vs. Iterations (Y, X) curve)

costs, params = model(x, y, [30,8,5,4,4,3,1], 3000)
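
To reproduce the Cost vs. Iterations curve mentioned above, a minimal plotting sketch (assuming matplotlib is installed) is:

import matplotlib.pyplot as plt
plt.plot(costs)                 # one cost value per iteration
plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.title('Cost vs. Iterations')
plt.show()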

Prediction function

def predict(x, params):
    AL, cache = fwdmodel(x, params)
    predictions = AL >= 0.5
    return predictions

And finally, when I do this

predictions = predict(xt,params)
predictions
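
To put a number on this (an illustrative check, not in the original post): since predict returns booleans, comparing against the test labels gives the accuracy, and an all-zero prediction simply scores the fraction of 0 labels in yt:

# With all-zero predictions this equals the share of 0 labels in yt
accuracy = np.mean(predictions == yt)
print('Test accuracy:', accuracy)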

I get this:

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

Please tell me where I am wrong.

Here's the link to the dataset.

Please help me out :D

I don't see why you've transposed your train-test-split output. Why use xt.T, x.T anyway? You should try printing your params (array) output and xt (array) output and see what they look like. Are they similar? Does your params output give the right result? Check all of that.
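
Following that suggestion, a minimal inspection sketch (illustrative, not from the original thread) might look like:

# Inspect the learned weights and the raw sigmoid outputs on the test set
for name, value in params.items():
    print(name, value.shape, 'min:', value.min(), 'max:', value.max())
AL, _ = fwdmodel(xt, params)
print('AL range:', AL.min(), AL.max())  # everything below 0.5 yields all-zero predictions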

The problem was that my neural network was too deep. It's a mistake that newbies like me tend to make. I found this awesome resource that helped me realize the mistake: http://theorangeduck.com/page/neural-network-not-working
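
For comparison, a shallower run with the same code might look like this (the layer sizes here are only an illustrative guess, not the poster's final choice):

# One hidden layer instead of five: 30 inputs -> 8 hidden units -> 1 output
costs, params = model(x, y, [30, 8, 1], 3000)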
