
tf.keras giving NaN loss and NaN validation error

I'm trying to build a deep neural network using the tf.keras API. I believe my model is right and I have removed all the NaN values, but I'm still getting NaN values during training. The dataset I'm using is the Wisconsin cancer dataset from UCI.

My code:

from tensorflow import keras
import pandas as pd
import tensorflow as tf

df = pd.read_csv('breastc.csv.csv')
df.dropna()
id_ = df['ID'].tolist()
del df['ID']
labels = df['Class'].tolist()
import numpy as np
del df['Class']
column_list='Compactness'
df[column_list] = df[column_list].apply(pd.to_numeric, errors='coerce')

model = keras.Sequential()
model.add(keras.layers.Dense(64,activation='relu',input_shape = (9,)))
model.add(keras.layers.Dense(64,activation='relu'))
model.add(keras.layers.Dense(1,activation='softmax'))

model.summary()

X=df.iloc[:].values

model.compile(optimizer=tf.train.AdamOptimizer(0.01),
              loss='mse',       # mean squared error
              metrics=['mae'])
model.fit(X,labels,batch_size=32,epochs=10,validation_split=0.2)

After the fit statement I get the following results:

Train on 559 samples, validate on 140 samples
Epoch 1/10
559/559 [==============================] - 0s 599us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 2/10
559/559 [==============================] - 0s 82us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 3/10
559/559 [==============================] - 0s 86us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 4/10
559/559 [==============================] - 0s 84us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 5/10
559/559 [==============================] - 0s 87us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 6/10
559/559 [==============================] - 0s 83us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 7/10
559/559 [==============================] - 0s 80us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 8/10
559/559 [==============================] - 0s 77us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 9/10
559/559 [==============================] - 0s 73us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
Epoch 10/10
559/559 [==============================] - 0s 62us/step - loss: nan - mean_absolute_error: nan - val_loss: nan - val_mean_absolute_error: nan
<tensorflow.python.keras._impl.keras.callbacks.History at 0x15c94a80cc0>

As you can see, no training is happening. Please guide me.

Yours sincerely, Vidit Shah

When you have a classification problem with two or more classes and want to choose exactly one of them, the last layer should usually have as many output neurons as there are classes, with softmax as its activation function, so that the output is a probability distribution over the classes. A softmax over a single unit always outputs 1, which is why Dense(1, activation='softmax') cannot learn anything. Once you have the output distribution, the prediction is the index in the output vector (i.e. the class) that received the highest probability.
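
To make the point about the output layer concrete, here is a minimal sketch in plain Python rather than Keras. The `softmax` helper is written for this illustration and is not a library function; the scores are made-up numbers.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# One output unit: the "distribution" has a single entry, so it is
# 1.0 no matter what score the network produces.
single_unit = [softmax([z])[0] for z in (-3.0, 0.0, 7.5)]

# One unit per class: a real distribution over the two classes.
probs = softmax([0.2, 1.7])

# The prediction is the index with the highest probability.
predicted_class = probs.index(max(probs))
```

Here `single_unit` contains only 1.0 values, while `probs` sums to 1 and puts most of its mass on class 1, so `predicted_class` is 1.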

The other thing you should change is the loss function. When you use softmax as the output activation, you should use a cross-entropy loss, which measures the distance between two distributions: in this case the output of your network and the gold distribution (all 0s, with a 1 at the index of the correct class). If your gold labels are stored as indices of the correct class (0 or 1 in your case), you can use those indices directly by setting sparse_categorical_crossentropy as the loss function; it treats each index as the corresponding one-hot vector.
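
The relation between the one-hot and the index form of the loss can be sketched in plain Python. The two helpers below only borrow the Keras loss names for readability; they are written for this illustration and are not the Keras implementations, and the probabilities are made-up numbers.

```python
import math

def categorical_crossentropy(one_hot, probs):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

def sparse_categorical_crossentropy(class_index, probs):
    """Same loss, with the target given as the index of the correct class."""
    return -math.log(probs[class_index])

probs = [0.1, 0.9]  # hypothetical network output for one sample

loss_one_hot = categorical_crossentropy([0.0, 1.0], probs)
loss_sparse = sparse_categorical_crossentropy(1, probs)

# The loss grows when the correct class received little probability.
loss_if_wrong = sparse_categorical_crossentropy(0, probs)
```

Both forms give the same value for the same sample; the only difference is how the target is encoded, which is why integer labels pair with the sparse variant and one-hot labels with the non-sparse one.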

To wrap everything up, you can change your code as follows:

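from tensorflow import keras
import pandas as pd
import tensorflow as tf

df = pd.read_csv('breastc.csv.csv')

# Coerce first: any non-numeric entry in this column becomes NaN here.
column_list = 'Compactness'
df[column_list] = pd.to_numeric(df[column_list], errors='coerce')
# dropna() returns a new frame, so the result has to be kept.
df = df.dropna()

id_ = df.pop('ID').tolist()
# Integer class indices. This assumes they are already 0 and 1;
# remap them first if the file uses other codes.
labels = df.pop('Class').values

model = keras.Sequential()
model.add(keras.layers.Dense(64, activation='relu', input_shape=(9,)))
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dense(2, activation='softmax'))  # one unit per class

model.summary()

X = df.values

# tf.train.AdamOptimizer is the TF 1.x API used in the question;
# in TF 2.x use keras.optimizers.Adam(0.01) instead.
model.compile(optimizer=tf.train.AdamOptimizer(0.01),
              loss='sparse_categorical_crossentropy',  # labels are indices, not one-hot
              metrics=['accuracy'])
model.fit(X, labels, batch_size=32, epochs=10, validation_split=0.2)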
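
Since every epoch in the question reports a NaN loss, it is also worth confirming that the feature matrix itself is free of NaN before calling fit. The question's code calls df.dropna() without keeping the result and coerces a column to numeric afterwards. Whether that is what happens in the asker's file is an assumption; the sketch below only reproduces the pattern on a hypothetical three-row CSV in which "?" stands in for a missing value.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV; "?" marks a missing value.
csv_text = "ID,Compactness,Class\n1,5,0\n2,?,1\n3,7,0\n"
df = pd.read_csv(io.StringIO(csv_text))

# dropna() returns a new frame; without assignment df is unchanged.
# At this point "?" is still a string, so nothing would be dropped anyway.
df.dropna()

# Coercion turns the non-numeric "?" into NaN.
df["Compactness"] = pd.to_numeric(df["Compactness"], errors="coerce")
nan_after_coerce = int(df.isna().sum().sum())

# Dropping after coercion, and keeping the result, removes the bad row.
clean = df.dropna()
nan_after_clean = int(clean.isna().sum().sum())
```

In this toy example `nan_after_coerce` is 1 and `nan_after_clean` is 0, with `clean` holding the two remaining rows. Running the same `isna().sum()` check on the real frame just before fit shows whether any NaN is reaching the network.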
