
How to solve loss: nan & accuracy: 0.0000e+00 in an LSTM problem? TensorFlow 2.x

I'm working on an LSTM problem. I'm trying to predict MBTI (Myers-Briggs test) personality types with text classification (there are 16 personality types).

I have a CSV file that was preprocessed: stopwords were removed, and the text was lemmatized, tokenized, converted to sequences and padded. The file doesn't contain any NaN values, and the text sequences contain only integers.

However, when I try to train the model I get:

loss: nan - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00

[Image: training the model]


As requested, here is what the x and y data and the labels look like, along with the results:

print(validation_label_seq)
[[ 5]
 [10]
 [ 4]
 [ 4]
 [15]
 [12]
 [ 1]...]

print(validation_padded[0])
maxlen = 240
array([  23,  353,  147,  677,    1,    1,  409,   10,  845, 1530,    1,
        103,  107,  998,  117, 1389,   25,    1,   28, 1889,  165,    1,
       1520,   49,  718,   65,   55,   34,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,...], dtype=int32)
print(train_label_seq)
[[ 8]
 [ 9]
 [ 3]
 [ 7]
 [ 4]
 [10]
 [15]
 [11]...]

print(train_data_padded[0])
maxlen = 240
array([ 19, 301, 133, 302, 562, 133,  28, 563, 895, 896, 897, 118,  99,
       564, 397,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0...], dtype=int32)

results = model.evaluate(validation_padded, validation_label_seq)

test = validation_padded[10]
predict = model.predict_classes([test])
print(predict[1])

59/59 [==============================] - 0s 1ms/sample - loss: nan - accuracy: 0.0000e+00
[0]
/tensorflow-2.1.0/python3.6/tensorflow_core/python/keras/engine/sequential.py:342: RuntimeWarning: invalid value encountered in greater
  return (proba > 0.5).astype('int32')

print(predict)

array([[0],
       [0],
       ...
       [0],
       [0]], dtype=int32)

What have I tried?

  • Switching to different optimizers
  • Lowering the batch size
  • Checking for value errors in the dataframe and in the sequences (train and validation data)
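
The checks listed above could be sketched roughly as follows. This is only an illustration in plain Python, not the code from the notebook: the helper names `find_non_integers` and `labels_out_of_range`, the sample rows and the 16-unit output width are all assumptions. The second helper is included because integer labels that do not fit the width of the output layer can be one source of a NaN loss, so it may be worth ruling out.

```python
import numbers


def find_non_integers(rows):
    """Return (row, column) positions whose value is not an integer (NaN included)."""
    bad = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if not isinstance(value, numbers.Integral):
                bad.append((r, c))
    return bad


def labels_out_of_range(labels, num_outputs):
    """Return the distinct labels that fall outside [0, num_outputs)."""
    return sorted({y for y in labels if not 0 <= y < num_outputs})


# Hypothetical padded sequences; the second row contains a NaN on purpose.
padded = [[19, 301, 133, 0, 0], [23, float("nan"), 147, 0, 0]]
bad_positions = find_non_integers(padded)
print(bad_positions)

# Hypothetical labels numbered 1..16, checked against an assumed 16-unit output layer.
labels = [8, 9, 3, 7, 4, 10, 15, 16]
bad_labels = labels_out_of_range(labels, num_outputs=16)
print(bad_labels)
```

If the second check reports anything, the labels would need to be shifted to start at 0 or the output layer widened; whether that applies here depends on how the labels were encoded.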

Expected output: Maybe I'm building the model incorrectly, so I'll explain the main idea. I would like to get either one output or sixteen outputs showing how confident the model is about the personality type.

1 output:
INTP: 89%

16 outputs:
ENTP: 5% | INTP: 81% | INTJ: 1% | ...

If you'd like to check, here is the code: mbti personality

Dataframe: mbti_df

Any suggestions to improve the question will be considered.

You are using softmax as the final output in the code. It produces a vector of probability values, so check what you are comparing it against in this code: the label-encoded targets. The two do not match, and that is why the accuracy is 0. I would suggest converting the softmax output to the correct form so that the accuracy metric gives the correct result.

Example:

Softmax output: [0.2, 0.8]. Output it is compared with: [0, 1].

Then there is a mismatch and the accuracy suffers.
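
One way to put the softmax output into a comparable form is to take the index of the largest probability (argmax) and compare that index with the integer label. Below is a minimal sketch in plain Python, assuming integer class labels that start at 0. The helper names and sample values are hypothetical and are not taken from the original code.

```python
def argmax(values):
    """Return the index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(softmax_rows, integer_labels):
    """Fraction of rows whose most probable class equals the integer label."""
    predicted = [argmax(row) for row in softmax_rows]
    hits = sum(p == y for p, y in zip(predicted, integer_labels))
    return hits / len(integer_labels)


# Hypothetical softmax outputs for three samples and two classes.
softmax_rows = [[0.2, 0.8], [0.9, 0.1], [0.4, 0.6]]
# Hypothetical label-encoded targets for the same three samples.
integer_labels = [1, 0, 0]

predicted_classes = [argmax(row) for row in softmax_rows]
print(predicted_classes)
print(accuracy(softmax_rows, integer_labels))
```

In Keras, the `sparse_categorical_crossentropy` loss and the `sparse_categorical_accuracy` metric are designed to compare a softmax output with integer labels directly, so they may be a better fit than comparing the raw probabilities by hand.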


 