Why is my loss trending down while my accuracy is going to zero?
I am trying to practice my machine learning skills with TensorFlow/Keras, but I am having trouble fitting the model. Let me explain what I've done and where I'm at.
I am using the dataset from Kaggle's Costa Rican Household Poverty Level Prediction Challenge.
Since I am just trying to get familiar with the TensorFlow workflow, I cleaned the dataset by removing a few columns that had a lot of missing data and then filled in the missing values in the other columns with the column mean. So there are no missing values in my dataset.
Next I loaded the new, cleaned CSV using make_csv_dataset from TF.
batch_size = 32
train_dataset = tf.data.experimental.make_csv_dataset(
    'clean_train.csv',
    batch_size,
    column_names=column_names,
    label_name=label_name,
    num_epochs=1)
I set up a function to return my compiled model like so:
f1_macro = tfa.metrics.F1Score(num_classes=4, average='macro')

def get_compiled_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(137,)),  # input shape required
        tf.keras.layers.Dense(256, activation=tf.nn.relu),
        tf.keras.layers.Dense(4, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=[f1_macro, 'accuracy'])
    return model

model = get_compiled_model()
model.fit(train_dataset, epochs=15)
Below is the result of that:
A link to my notebook is here.
I should mention that I strongly based my implementation on TensorFlow's iris data walkthrough.
Thank you!
After a while, I was able to find the issues with your code. They are listed in order of importance (the first is the most important).
You are doing multi-class classification (not binary classification). Therefore your loss should be categorical_crossentropy.
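As a rough illustration of what categorical_crossentropy measures, here is a plain-Python sketch (not the Keras implementation). It assumes a one-hot target and a softmax output over 4 classes; the helper name, the `eps` clipping value, and the example probabilities are made up for this example.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and a probability vector."""
    # eps is an assumed clipping value that keeps log() away from zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0.0, 0.0, 1.0, 0.0]          # class 3 of 4, one-hot encoded
confident = [0.05, 0.05, 0.85, 0.05]   # most mass on the correct class
unsure = [0.25, 0.25, 0.25, 0.25]      # uniform guess
wrong = [0.85, 0.05, 0.05, 0.05]       # most mass on a wrong class

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
loss_wrong = categorical_crossentropy(y_true, wrong)
print(loss_confident, loss_unsure, loss_wrong)
```

Only the probability assigned to the true class enters the loss, so it shrinks exactly when the model puts more mass on the correct class.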
You are not one-hot encoding your labels. Using binary_crossentropy with labels given as numerical IDs is definitely not the way forward. Instead, you should one-hot encode your labels and solve this as a multi-class classification problem. Here's how you do that.
def pack_features_vector(features, labels):
    """Pack the features into a single array."""
    features = tf.stack(list(features.values()), axis=1)
    return features, tf.one_hot(tf.cast(labels-1, tf.int32), depth=4)
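To make the label shift concrete, here is a small plain-Python sketch of what the one-hot step above produces. It assumes the targets are the integers 1 to 4, which is what `labels-1` with `depth=4` implies; the helper `one_hot` and the sample labels are hypothetical.

```python
def one_hot(label, depth=4):
    """Return a one-hot list for a 1-based class label (1..depth)."""
    index = label - 1  # shift to a 0-based index, like labels-1 above
    return [1.0 if i == index else 0.0 for i in range(depth)]

labels = [4, 2, 1, 3]  # made-up sample of targets
encoded = [one_hot(label) for label in labels]
for label, row in zip(labels, encoded):
    print(label, row)
```

Each row contains a single 1.0, so it has the same shape as the 4-unit softmax output it is compared against.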
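You should also standardize your features:
import pandas as pd
from sklearn import preprocessing

x = train_df[feature_names].values  # returns a NumPy array
scaler = preprocessing.StandardScaler()  # zero mean, unit variance per column
x_scaled = scaler.fit_transform(x)
train_df = pd.DataFrame(x_scaled)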
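For intuition, the scaling step above can be sketched for a single column in plain Python. This is only an illustration of standardization (subtract the mean, divide by the population standard deviation), not scikit-learn's code; the function name and the sample column are hypothetical.

```python
import statistics

def standardize(column):
    """Scale a column to zero mean and unit variance."""
    mean = statistics.fmean(column)
    # Population standard deviation; fall back to 1.0 for a constant column.
    std = statistics.pstdev(column) or 1.0
    return [(value - mean) / std for value in column]

ages = [20.0, 30.0, 40.0, 50.0]  # made-up feature column
scaled = standardize(ages)
print(scaled)
```

After this step every feature lives on a comparable scale, which generally makes training a dense network with a gradient-based optimizer better behaved.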
Fixing these issues should set your model straight.
As the other comment already gives some best-practice advice that is definitely worth considering, this comment concentrates on your observation that your loss and accuracy are decoupled, which is counterintuitive at first.
Have a look at metrics.py; there you can find the definitions of all available metrics, including the different types of accuracy.
The type of accuracy is determined based on the objective function; see training.py. binary_accuracy is the default choice in the following case:
if output_shape[-1] == 1 or self.loss_functions[i] == objectives.binary_crossentropy:
    # case: binary accuracy
    acc_fn = metrics_module.binary_accuracy
And binary_accuracy is defined as follows in metrics.py:
def binary_accuracy(y_true, y_pred):
    '''Calculates the mean accuracy rate across all predictions for binary
    classification problems.
    '''
    return K.mean(K.equal(y_true, K.round(y_pred)))
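The effect of that definition on this problem can be sketched in plain Python. The sketch assumes the integer labels are 1 to 4 and that a single label per sample is broadcast against the 4 softmax outputs; both are assumptions about how the data reaches the metric, and the predictions are made up.

```python
def binary_accuracy(y_true, y_pred):
    """Mean of elementwise y_true == round(y_pred), as in the definition above."""
    matches = []
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) == 1:
            # Assumed broadcasting of one integer label across all outputs.
            true_row = true_row * len(pred_row)
        matches.extend(float(t == round(p)) for t, p in zip(true_row, pred_row))
    return sum(matches) / len(matches)

y_pred = [[0.9, 0.05, 0.03, 0.02],   # confident in class 1
          [0.1, 0.1, 0.1, 0.7]]      # confident in class 4
integer_labels = [[1], [4]]          # raw class IDs
one_hot_labels = [[1, 0, 0, 0], [0, 0, 0, 1]]

acc_integer = binary_accuracy(integer_labels, y_pred)
acc_one_hot = binary_accuracy(one_hot_labels, y_pred)
print(acc_integer, acc_one_hot)
```

Rounded probabilities are always 0 or 1, so a raw label of 2, 3, or 4 can never match any output. Under these assumptions both predictions are correct, yet only the one-hot labels are scored as such.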
The objective function looks like this:
def binary_crossentropy(y_true, y_pred, from_logits=False, label_smoothing=0):
    y_pred = K.constant(y_pred) if not K.is_tensor(y_pred) else y_pred
    y_true = K.cast(y_true, y_pred.dtype)
    if label_smoothing is not 0:
        smoothing = K.cast_to_floatx(label_smoothing)
        y_true = K.switch(K.greater(smoothing, 0),
                          lambda: y_true * (1.0 - smoothing) + 0.5 * smoothing,
                          lambda: y_true)
    return K.mean(K.binary_crossentropy(y_true, y_pred, from_logits=from_logits), axis=-1)
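The core of that function can be sketched in plain Python for a single sample, assuming label_smoothing=0 and from_logits=False. This is an illustration rather than the backend implementation; the `eps` clipping value and the example vectors are assumptions.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over the outputs of the per-output binary cross-entropy."""
    losses = []
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # assumed clipping to avoid log(0)
        losses.append(-(t * math.log(p) + (1.0 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)

y_true = [0.0, 0.0, 1.0, 0.0]     # one-hot target, class 3
good = [0.05, 0.05, 0.85, 0.05]   # mass on the correct class
bad = [0.85, 0.05, 0.05, 0.05]    # mass on a wrong class

loss_good = binary_crossentropy(y_true, good)
loss_bad = binary_crossentropy(y_true, bad)
print(loss_good, loss_bad)
```

Each of the 4 outputs is scored as an independent yes/no decision and the results are averaged over the last axis, which is a different quantity from the accuracy computed by rounding.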
So to wrap it up: