My neural network only predicts one thing
I am new to deep neural networks and am trying to implement one in Python from scratch. I tried a lot but couldn't find the bug in my implementation. Whenever I use my 'predict' function, it always outputs 0. I have also tested every function in the code below using random arrays of the same shape as x and y (defined below), and all of them seem to work perfectly. I had also cleaned the data beforehand.
import os
os.chdir(r'path where my data is store')  # change the working directory to where the data set is stored
Creating a dataframe and assigning values to the input and target vectors
import pandas as pd
import numpy as np
df = pd.read_csv('clean_data.csv')
X = df[['radius_mean', 'texture_mean', 'perimeter_mean',
'area_mean', 'smoothness_mean', 'compactness_mean', 'concavity_mean',
'concave points_mean', 'symmetry_mean', 'fractal_dimension_mean',
'radius_se', 'texture_se', 'perimeter_se', 'area_se', 'smoothness_se',
'compactness_se', 'concavity_se', 'concave points_se', 'symmetry_se',
'fractal_dimension_se', 'radius_worst', 'texture_worst',
'perimeter_worst', 'area_worst', 'smoothness_worst',
'compactness_worst', 'concavity_worst', 'concave points_worst',
'symmetry_worst', 'fractal_dimension_worst']].values
Y = df['diagnosis'].values
Y = Y.reshape(569,1)
Splitting the data into training and test sets (x and y are the training set; xt and yt are the test set)
from sklearn.model_selection import train_test_split
x, xt, y, yt = train_test_split(X, Y, test_size = 0.2, random_state = 40)
x, xt, y, yt = x.T, xt.T, y.T, yt.T
Initializing parameters
def iniparams(layer_dims):
    params = {}
    for l in range(1, len(layer_dims)):
        params['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1])*0.01
        params['b' + str(l)] = np.zeros((layer_dims[l], 1))
    return params
Writing helper functions
#1
def sigmoid(Z):
    return 1/(1 + np.exp(-Z)), Z
#2
def relu(Z):
    return np.maximum(0, Z), Z
Linear forward
def linearfwd(W, A, b):
    Z = np.dot(W, A) + b
    linear_cache = (W, A, b)
    return Z, linear_cache
Forward activation
def fwdactivation(W, A_prev, b, activation):
    if activation == 'sigmoid':
        Z, linear_cache = linearfwd(W, A_prev, b)
        A, activation_cache = sigmoid(Z)
    elif activation == 'relu':
        Z, linear_cache = linearfwd(W, A_prev, b)
        A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache
Forward model
def fwdmodel(x, params):
    caches = []
    L = len(params)//2
    A = x
    for l in range(1, L):
        A_prev = A
        A, cache = fwdactivation(params['W' + str(l)], A_prev, params['b' + str(l)], 'relu')
        caches.append(cache)
    AL, cache = fwdactivation(params['W' + str(L)], A, params['b' + str(L)], 'sigmoid')
    caches.append(cache)
    return AL, caches
Computing cost
def J(AL, y):
    return -np.sum(np.multiply(np.log(AL), y) + np.multiply(np.log(1 - AL), (1 - y)))/y.shape[1]
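One thing worth ruling out with a cost like the one above is `np.log(0)`: if the sigmoid output ever saturates to exactly 0 or 1, the cost becomes `inf` or `nan`. The following is a minimal sketch of a clipped variant. The name `stable_cost` and the `eps` value are assumptions for illustration and are not part of the original post.

```python
import numpy as np

def stable_cost(AL, y, eps=1e-12):
    """Binary cross-entropy averaged over samples, with AL clipped away from 0 and 1.

    AL and y are expected to have shape (1, m), matching the layout used in the post.
    `eps` is an illustrative choice, not a value taken from the original code.
    """
    AL = np.clip(AL, eps, 1 - eps)
    total = np.sum(y * np.log(AL) + (1 - y) * np.log(1 - AL))
    return float(-total / y.shape[1])

# Uninformative predictions of 0.5 give a cost of log(2).
demo_cost = stable_cost(np.array([[0.5, 0.5]]), np.array([[1, 0]]))

# Fully saturated and wrong predictions stay finite instead of becoming inf.
saturated_cost = stable_cost(np.array([[1.0, 0.0]]), np.array([[0, 1]]))

print(demo_cost, saturated_cost)
```

A cost that sits near log(2) ≈ 0.693 for many iterations usually means the network is outputting about 0.5 for every sample.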
Backward sigmoid
def sigmoidbkwd(dA, cache):
    Z = cache
    s = 1/(1 + np.exp(-Z))
    dZ = dA*s*(1 - s)
    return dZ
Backward relu
def relubkwd(dA, cache):
    Z = cache
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0  # the gradient is zero wherever the ReLU was inactive
    return dZ
Linear backward
def linearbkwd(dZ, cache):
    W, A_prev, b = cache
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T)/m
    db = np.sum(dZ, axis = 1, keepdims = True)/m
    dA_prev = np.dot(W.T, dZ)
    return dW, dA_prev, db
Backward activation
def bkwdactivation(dA, cache, activation):
    linear_cache, activation_cache = cache
    if activation == 'sigmoid':
        dZ = sigmoidbkwd(dA, activation_cache)
        dW, dA_prev, db = linearbkwd(dZ, linear_cache)
    if activation == 'relu':
        dZ = relubkwd(dA, activation_cache)
        dW, dA_prev, db = linearbkwd(dZ, linear_cache)
    return dW, dA_prev, db
Backward model
def bkwdmodel(AL, y, cache):
    grads = {}
    L = len(cache)
    dAL = -(np.divide(y, AL) - np.divide(1 - y, 1 - AL))
    current_cache = cache[L - 1]
    grads['dW' + str(L)], grads['dA' + str(L - 1)], grads['db' + str(L)] = bkwdactivation(dAL, current_cache, 'sigmoid')
    for l in reversed(range(L - 1)):
        current_cache = cache[l]
        dW_temp, dA_prev_temp, db_temp = bkwdactivation(grads['dA' + str(l + 1)], current_cache, 'relu')
        grads['dW' + str(l + 1)] = dW_temp
        grads['dA' + str(l)] = dA_prev_temp
        grads['db' + str(l + 1)] = db_temp
    return grads
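A common way to test a hand-written backward pass is a numerical gradient check: perturb each parameter slightly, recompute the cost, and compare the finite-difference slope with the analytic gradient. The sketch below is an illustration only. `numerical_grads`, `relative_error`, and the toy single-layer model are hypothetical helpers, not code from the post, and the toy model stands in for the full network so that the example is self-contained.

```python
import numpy as np

def numerical_grads(cost_fn, params, eps=1e-6):
    """Central-difference estimate of d(cost)/d(param) for every entry of every array."""
    grads = {}
    for name, value in params.items():
        grad = np.zeros_like(value)
        for idx in np.ndindex(value.shape):
            original = value[idx]
            value[idx] = original + eps
            cost_plus = cost_fn(params)
            value[idx] = original - eps
            cost_minus = cost_fn(params)
            value[idx] = original  # restore the entry before moving on
            grad[idx] = (cost_plus - cost_minus) / (2 * eps)
        grads['d' + name] = grad
    return grads

def relative_error(a, b):
    """Scale-free difference between two gradient arrays."""
    return np.linalg.norm(a - b) / (np.linalg.norm(a) + np.linalg.norm(b) + 1e-12)

# Toy stand-in: one sigmoid layer on made-up data with shape (features, samples).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(3, 5))
y_demo = rng.integers(0, 2, size=(1, 5)).astype(float)
demo_params = {'W1': rng.normal(size=(1, 3)) * 0.1, 'b1': np.zeros((1, 1))}

def demo_cost(p):
    a = 1 / (1 + np.exp(-(p['W1'] @ x_demo + p['b1'])))
    return -np.mean(y_demo * np.log(a) + (1 - y_demo) * np.log(1 - a))

def demo_analytic_grads(p):
    m = x_demo.shape[1]
    a = 1 / (1 + np.exp(-(p['W1'] @ x_demo + p['b1'])))
    dZ = a - y_demo  # sigmoid + cross-entropy gradient with respect to Z
    return {'dW1': dZ @ x_demo.T / m, 'db1': np.sum(dZ, axis=1, keepdims=True) / m}

numeric = numerical_grads(demo_cost, demo_params)
analytic = demo_analytic_grads(demo_params)
errors = {key: relative_error(numeric[key], analytic[key]) for key in numeric}
print(errors)
```

For the code in the post, the same check could be run by passing `lambda p: J(fwdmodel(x, p)[0], y)` as `cost_fn` on a small slice of the data and comparing the result with the `dW` and `db` entries returned by `bkwdmodel`. Relative errors far above roughly 1e-5 would point at the backward pass.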
Optimizing parameters using gradient descent
def optimize(grads, params, alpha):
    L = len(params)//2
    for l in range(1, L + 1):
        params['W' + str(l)] = params['W' + str(l)] - alpha*grads['dW' + str(l)]
        params['b' + str(l)] = params['b' + str(l)] - alpha*grads['db' + str(l)]
    return params
Neural Network Model
def model(x, y, layer_dims, iters):
    costs = []
    params = iniparams(layer_dims)
    for i in range(1, iters):
        AL, caches = fwdmodel(x, params)
        cost = J(AL, y)
        costs.append(cost)
        grads = bkwdmodel(AL, y, caches)
        params = optimize(grads, params, 1.2)
        if i%100 == 0:
            print('Cost after', i,'iterations is:', cost)
            costs.append(cost)
    return costs, params
Calculation (the cost does decrease, per the Cost vs. Iterations (Y, X) curve)
costs, params = model(x, y, [30,8,5,4,4,3,1], 3000)
Prediction function
def predict(x,params):
    AL, cache = fwdmodel(x,params)
    predictions = AL >= 0.5
    return predictions
And finally, when I do this
predictions = predict(xt,params)
predictions
I get this:我明白了:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])数组([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
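An all-zero prediction array only says that every probability fell below 0.5. It does not say how far below. Looking at the raw sigmoid outputs before thresholding can help tell "stuck at a constant" apart from "learned something but is biased". The helper below is a hypothetical sketch, and the `0.37` probabilities and the labels are made-up stand-ins rather than values from the post.

```python
import numpy as np

def summarize_probabilities(AL, y, threshold=0.5):
    """Summarize raw probabilities of shape (1, m) against labels of the same shape."""
    AL = np.asarray(AL, dtype=float)
    y = np.asarray(y).astype(int)
    predictions = (AL >= threshold).astype(int)
    positive_rate = float(y.mean())
    return {
        'prob_min': float(AL.min()),
        'prob_max': float(AL.max()),
        'prob_mean': float(AL.mean()),
        'predicted_positive_fraction': float(predictions.mean()),
        'accuracy': float((predictions == y).mean()),
        # Accuracy obtained by always predicting the more common class.
        'majority_baseline': max(positive_rate, 1 - positive_rate),
    }

# Made-up example: a model that outputs the same probability for every sample.
demo_y = np.array([[1, 0, 0, 1, 0, 0, 0, 1, 0, 0]])
demo_AL = np.full(demo_y.shape, 0.37)
report = summarize_probabilities(demo_AL, demo_y)
print(report)
```

With the code from the post, `AL` would come from `fwdmodel(xt, params)[0]` and `y` from `yt`. If `prob_min` and `prob_max` are nearly identical and `accuracy` equals `majority_baseline`, the network is most likely predicting the class prior for every input.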
Please tell me where I am going wrong.
Here's the link to the dataset.
Please help me out :D
I don't see why you've transposed your train-test-split output. Why use xt.T and x.T anyway? You should try printing your params (array) output and xt (array) output and see what they look like. Are they similar? Does your params output give the right result? Check all of that.
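The inspection suggested here can be sketched as a small helper that prints the shape and value range of each array. `describe_array` is a hypothetical name, and the arrays below are random stand-ins with the shapes the post's code would produce for a `[30, 8, 1]` layer list and a 114-sample test set. They are not the real data or trained parameters.

```python
import numpy as np

def describe_array(name, arr):
    """Print and return the shape, value range, and NaN status of an array."""
    arr = np.asarray(arr, dtype=float)
    summary = {
        'shape': arr.shape,
        'min': float(arr.min()),
        'max': float(arr.max()),
        'mean': float(arr.mean()),
        'has_nan': bool(np.isnan(arr).any()),
    }
    print(name, summary)
    return summary

# Stand-in arrays (assumed shapes, random values).
rng = np.random.default_rng(0)
demo_xt = rng.normal(size=(30, 114))  # (features, samples), as after the .T in the post
demo_params = {
    'W1': rng.normal(size=(8, 30)) * 0.01,
    'b1': np.zeros((8, 1)),
    'W2': rng.normal(size=(1, 8)) * 0.01,
    'b2': np.zeros((1, 1)),
}

summaries = {'xt': describe_array('xt', demo_xt)}
for key, value in demo_params.items():
    summaries[key] = describe_array(key, value)
```

On the transpose itself: `linearfwd` computes `np.dot(W, A)` with `W` of shape `(layer_dims[l], layer_dims[l - 1])`, so the inputs need the layout `(features, samples)`, which is what `x.T` and `xt.T` provide. Weights that have barely moved from their initial scale, or any `has_nan` flag set to `True`, would be worth investigating.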
The problem was that my neural network was too deep. It's a mistake that newbies like me tend to make. I found this awesome resource that helped me realize this mistake: http://theorangeduck.com/page/neural-network-not-working
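One way depth can hurt with the `*0.01` weight initialization used in the question is that the signal shrinks at every ReLU layer, so the last pre-activation ends up near zero for every input. The sketch below illustrates that mechanism under stated assumptions: the inputs are random standard-normal stand-ins rather than the real dataset, `preactivation_scale_by_layer` is a hypothetical helper, and the `[30, 8, 1]` layer list is only an illustrative shallower alternative, not the configuration the author ended up with.

```python
import numpy as np

def preactivation_scale_by_layer(layer_dims, n_samples=200, weight_scale=0.01, seed=0):
    """Mean |Z| at each layer of a ReLU stack with small random weights and zero biases."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(layer_dims[0], n_samples))  # assumed stand-in inputs
    scales = []
    for l in range(1, len(layer_dims)):
        W = rng.normal(size=(layer_dims[l], layer_dims[l - 1])) * weight_scale
        Z = W @ A
        scales.append(float(np.abs(Z).mean()))
        A = np.maximum(0, Z)  # ReLU feeds the next layer
    return scales

deep = preactivation_scale_by_layer([30, 8, 5, 4, 4, 3, 1])  # layer list from the question
shallow = preactivation_scale_by_layer([30, 8, 1])           # illustrative alternative

print('deep   :', deep)
print('shallow:', shallow)
```

When the final pre-activation is that close to zero, the sigmoid output is about 0.5 for every sample and the gradients reaching the early layers are correspondingly small, so training tends to move only the last bias toward the class prior. Trying a much smaller layer list first is a cheap experiment.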