My neural network only predicts one thing

Question

I am new to deep neural networks and am trying to implement one in Python from scratch. I have tried a lot but cannot find the bug in my implementation: whenever I use my 'predict' function, it always outputs 0. I have tested every function in the code below using random arrays of the same shape as x and y (defined below), and all of them seem to work correctly. I have also cleaned the data beforehand.

import os
os.chdir(r'path where my data is store')  # change the working directory to where my dataset is stored

Creating a dataframe and assigning values to the input and target vectors

import pandas as pd
import numpy as np
df = pd.read_csv('clean_data.csv')
X = df[['radius_mean', 'texture_mean', 'perimeter_mean',
   'area_mean', 'smoothness_mean', 'compactness_mean', 'concavity_mean',
   'concave points_mean', 'symmetry_mean', 'fractal_dimension_mean',
   'radius_se', 'texture_se', 'perimeter_se', 'area_se', 'smoothness_se',
   'compactness_se', 'concavity_se', 'concave points_se', 'symmetry_se',
   'fractal_dimension_se', 'radius_worst', 'texture_worst',
   'perimeter_worst', 'area_worst', 'smoothness_worst',
   'compactness_worst', 'concavity_worst', 'concave points_worst',
   'symmetry_worst', 'fractal_dimension_worst']].values
Y = df['diagnosis'].values 
Y = Y.reshape(569,1)

Splitting the data into training and test sets (x and y are the training set; xt and yt are the test set)

from sklearn.model_selection import train_test_split
x, xt, y, yt = train_test_split(X, Y, test_size = 0.2, random_state = 40)
x, xt, y, yt = x.T, xt.T, y.T, yt.T
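
The post does not show any feature scaling, and columns such as area_mean and smoothness_mean live on very different scales. The following is a minimal sketch of per-feature standardization for the (features, samples) layout used above. The helper name standardize and the demo arrays are hypothetical, not part of the original code, and whether scaling matters for this particular bug is an assumption.

```python
import numpy as np

def standardize(train, test, eps=1e-8):
    """Scale each feature (row) to zero mean and unit variance.

    Statistics come from the training set only and are reused for the
    test set, so no test information leaks into training.
    """
    mu = train.mean(axis=1, keepdims=True)
    sigma = train.std(axis=1, keepdims=True)
    return (train - mu) / (sigma + eps), (test - mu) / (sigma + eps)

# Hypothetical stand-ins for x and xt, shaped (n_features, n_samples)
rng = np.random.default_rng(0)
x_demo = rng.normal(loc=500.0, scale=200.0, size=(3, 50))
xt_demo = rng.normal(loc=500.0, scale=200.0, size=(3, 10))
x_std, xt_std = standardize(x_demo, xt_demo)
```

With the real data this would be called as x, xt = standardize(x, xt) right after the transpose.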

Initializing parameters

def iniparams(layer_dims):
    params = {}
    for l in range(1, len(layer_dims)):
        # Small random weights, zero biases
        params['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1])*0.01
        params['b' + str(l)] = np.zeros((layer_dims[l], 1))
    return params

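Writing helper functions: #1 (sigmoid)

def sigmoid(Z):
    # Returns the activation and Z (cached for the backward pass)
    return 1/(1 + np.exp(-Z)), Z

#2 (relu)

def relu(Z):
    # Returns the activation and Z (cached for the backward pass)
    return np.maximum(0, Z), Z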

Linear forward

def linearfwd(W, A, b):
    Z = np.dot(W, A) + b
    linear_cache = (W, A, b)
    return Z, linear_cache

Forward activation

def fwdactivation(W, A_prev, b, activation):
    if activation == 'sigmoid':
        Z, linear_cache = linearfwd(W, A_prev, b)
        A, activation_cache = sigmoid(Z)
    elif activation == 'relu':
        Z, linear_cache = linearfwd(W, A_prev, b)
        A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache

Forward model

def fwdmodel(x, params):
    caches = []
    L = len(params)//2  # number of layers (one W and one b per layer)
    A = x
    # Hidden layers use relu
    for l in range(1, L):
        A_prev = A
        A, cache = fwdactivation(params['W' + str(l)], A_prev, params['b' + str(l)], 'relu')
        caches.append(cache)
    # Output layer uses sigmoid
    AL, cache = fwdactivation(params['W' + str(L)], A, params['b' + str(L)], 'sigmoid')
    caches.append(cache)
    return AL, caches

Computing cost

def J(AL, y):
    # Binary cross-entropy averaged over the m examples (columns of y)
    return -np.sum(np.multiply(np.log(AL), y) + np.multiply(np.log(1 - AL), (1 - y)))/y.shape[1]
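
One thing to watch with this cost is that np.log returns -inf when AL reaches exactly 0 or 1, which can happen once the sigmoid saturates. Below is a minimal sketch of the same binary cross-entropy with the probabilities clipped first. The name safe_cost and the eps value are assumptions for illustration, not part of the original post.

```python
import numpy as np

def safe_cost(AL, y, eps=1e-12):
    """Binary cross-entropy with AL clipped away from 0 and 1."""
    AL = np.clip(AL, eps, 1 - eps)  # keeps both log() calls finite
    m = y.shape[1]                  # number of examples (columns)
    return float(-np.sum(y * np.log(AL) + (1 - y) * np.log(1 - AL)) / m)

# Two saturated but correct predictions plus one undecided prediction
AL_demo = np.array([[0.0, 1.0, 0.5]])
y_demo = np.array([[0, 1, 1]])
cost_demo = safe_cost(AL_demo, y_demo)  # finite, dominated by the 0.5 entry
```

Without the clip, the first two entries would produce 0 * log(0) terms and the cost would come out as nan.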

Backward sigmoid

def sigmoidbkwd(dA, cache):
    Z = cache
    s = 1/(1 + np.exp(-Z))
    dZ = dA*s*(1 - s)
    return dZ

Backward relu

def relubkwd(dA, cache):
    Z = cache
    # The relu gradient is 1 where Z > 0 and 0 elsewhere
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ

Linear backward

def linearbkwd(dZ, cache):
    W, A_prev, b = cache
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T)/m
    db = np.sum(dZ, axis = 1, keepdims = True)/m
    dA_prev = np.dot(W.T, dZ)
    return dW, dA_prev, db

Backward activation

def bkwdactivation(dA, cache, activation):
    linear_cache, activation_cache = cache
    if activation == 'sigmoid':
        dZ = sigmoidbkwd(dA, activation_cache)
        dW, dA_prev, db = linearbkwd(dZ, linear_cache)
    elif activation == 'relu':
        dZ = relubkwd(dA, activation_cache)
        dW, dA_prev, db = linearbkwd(dZ, linear_cache)
    return dW, dA_prev, db

Backward model

def bkwdmodel(AL, y, cache):
    grads = {}
    L = len(cache)
    # Derivative of the cross-entropy cost with respect to AL
    dAL = -(np.divide(y, AL) - np.divide(1 - y, 1 - AL))
    current_cache = cache[L - 1]
    grads['dW' + str(L)], grads['dA' + str(L - 1)], grads['db' + str(L)] = bkwdactivation(dAL, current_cache, 'sigmoid')
    # Walk back through the relu layers
    for l in reversed(range(L - 1)):
        current_cache = cache[l]
        dW_temp, dA_prev_temp, db_temp = bkwdactivation(grads['dA' + str(l + 1)], current_cache, 'relu')
        grads['dW' + str(l + 1)] = dW_temp
        grads['dA' + str(l)] = dA_prev_temp
        grads['db' + str(l + 1)] = db_temp
    return grads
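
Testing each function with random arrays of the right shape confirms that the shapes line up, but not that the gradients are correct. A finite-difference gradient check is one common way to test that. The sketch below applies it to a single sigmoid layer only, so it stays self-contained. All names in it (sigmoid_only, cost_fn, analytic_dW, numeric_dW and the demo arrays) are hypothetical, and the same idea would need to be adapted to the full bkwdmodel.

```python
import numpy as np

def sigmoid_only(Z):
    return 1 / (1 + np.exp(-Z))

def cost_fn(W, b, x, y):
    """Cross-entropy of a single sigmoid layer, averaged over examples."""
    AL = sigmoid_only(W @ x + b)
    return -np.mean(y * np.log(AL) + (1 - y) * np.log(1 - AL))

def analytic_dW(W, b, x, y):
    """Closed-form gradient for sigmoid + cross-entropy: (AL - y) x^T / m."""
    AL = sigmoid_only(W @ x + b)
    return (AL - y) @ x.T / x.shape[1]

def numeric_dW(W, b, x, y, h=1e-6):
    """Central finite differences, one weight at a time."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[idx] += h
        Wm[idx] -= h
        grad[idx] = (cost_fn(Wp, b, x, y) - cost_fn(Wm, b, x, y)) / (2 * h)
    return grad

rng = np.random.default_rng(1)
x_demo = rng.standard_normal((4, 20))               # (features, samples)
y_demo = (rng.random((1, 20)) > 0.5).astype(float)  # random 0/1 labels
W_demo = rng.standard_normal((1, 4)) * 0.1
b_demo = np.zeros((1, 1))

diff = np.max(np.abs(analytic_dW(W_demo, b_demo, x_demo, y_demo)
                     - numeric_dW(W_demo, b_demo, x_demo, y_demo)))
```

If the analytic and numeric gradients agree to several decimal places, the backward pass for that layer is probably fine and the problem lies elsewhere.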

Optimizing parameters using gradient descent

def optimize(grads, params, alpha):
    L = len(params)//2
    for l in range(1, L + 1):
        params['W' + str(l)] = params['W' + str(l)] - alpha*grads['dW' + str(l)]
        params['b' + str(l)] = params['b' + str(l)] - alpha*grads['db' + str(l)]
    return params

Neural network model

def model(x, y, layer_dims, iters):
    costs = []
    params = iniparams(layer_dims)
    for i in range(1, iters):
        AL, caches = fwdmodel(x, params)
        cost = J(AL, y)
        costs.append(cost)  # record the cost once per iteration
        grads = bkwdmodel(AL, y, caches)
        params = optimize(grads, params, 1.2)
        if i%100 == 0:
            print('Cost after', i, 'iterations is:', cost)
    return costs, params

Calculation (the cost does go down: Cost vs. Iterations (Y, X) curve)

costs, params = model(x, y, [30,8,5,4,4,3,1], 3000)

Prediction function

def predict(x, params):
    AL, cache = fwdmodel(x, params)
    # Threshold the sigmoid output at 0.5 and return 0/1 labels
    predictions = (AL >= 0.5).astype(int)
    return predictions

And finally, when I do this:

predictions = predict(xt,params)
predictions

I get this:

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

Please tell me where I am going wrong.

Here's the link to the dataset.

Please help me out :D

Answer 1

I don't see why you've transposed your train-test-split output. Why use xt.T and x.T at all? Try printing your params (array) output and your xt (array) output and see what they look like. Are they similar? Does your params output give the right result? Check all of that.
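
The inspection suggested here could be done with a small helper like the one below. It is a minimal sketch: describe, xt_demo and params_demo are hypothetical stand-ins for the real xt and params. Note that the transposes in the question do match the (features, samples) layout that np.dot(W, A) in linearfwd expects, since W1 has shape (8, 30) and x has shape (30, m), so the shape check at the end is the part most likely to confirm or rule out a layout problem.

```python
import numpy as np

def describe(name, arr):
    """One-line summary of an array: shape and basic statistics."""
    arr = np.asarray(arr, dtype=float)
    return (f"{name}: shape={arr.shape}, min={arr.min():.4g}, "
            f"max={arr.max():.4g}, mean={arr.mean():.4g}")

# Hypothetical stand-ins for the test inputs and first-layer parameters
rng = np.random.default_rng(0)
xt_demo = rng.normal(size=(30, 114))
params_demo = {"W1": rng.normal(size=(8, 30)) * 0.01, "b1": np.zeros((8, 1))}

report = [describe("xt", xt_demo)]
report += [describe(name, value) for name, value in params_demo.items()]
for line in report:
    print(line)

# W1 must have as many columns as the input has rows (features)
shapes_ok = params_demo["W1"].shape[1] == xt_demo.shape[0]
```

Printing the raw sigmoid output AL from fwdmodel in the same way, before thresholding, would also show whether every prediction sits just below 0.5 or is pushed hard toward 0.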

Answer 2

The problem was that my neural network was too deep. It's a mistake that newbies like me tend to make. I found this awesome resource that helped me realize the mistake: http://theorangeduck.com/page/neural-network-not-working
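
One plausible reading of why depth hurt here, offered as an illustration rather than a confirmed diagnosis: with weights scaled by 0.01 and zero biases, as in iniparams, each relu layer shrinks the signal, so after five hidden layers almost nothing of the input reaches the output unit. The sketch below measures this on random inputs. The function activation_scales and its arguments are hypothetical, and the inputs are standard normal stand-ins rather than the real dataset.

```python
import numpy as np

def activation_scales(layer_dims, weight_scale=0.01, m=200, seed=0):
    """Mean absolute activation after each hidden relu layer at initialization."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((layer_dims[0], m))  # stand-in for scaled inputs
    scales = []
    for l in range(1, len(layer_dims) - 1):      # hidden layers only
        W = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * weight_scale
        A = np.maximum(0, W @ A)                 # biases are zero at initialization
        scales.append(float(np.abs(A).mean()))
    return scales

deep = activation_scales([30, 8, 5, 4, 4, 3, 1])  # architecture from the question
shallow = activation_scales([30, 8, 1])           # a much shallower alternative

print("deep:   ", deep)
print("shallow:", shallow)
```

In the deep network the last hidden activations are many orders of magnitude smaller than in the shallow one, so the output would depend almost entirely on the final bias, which would be consistent with predicting a single class for every input. Under that assumption, trying something like model(x, y, [30, 8, 1], 3000) is a reasonable first experiment.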

2 answers

Answer 1: score 0, posted 2020-07-03 13:32:15

Answer 2: score 0, accepted, posted 2020-07-05 05:45:23
