Total number of TP, TN, FP & FN do not sum up to total number of observed values

I was going through TensorFlow's "Classification on imbalanced data" tutorial, which uses Kaggle's Credit Card Fraud Detection dataset. The tutorial reports 182276 training examples and 45569 validation examples. To evaluate the baseline model it uses Keras's built-in metrics: TruePositives, FalsePositives, TrueNegatives and FalseNegatives.

However, the training logs in the "Train the model" section show that FP+TP+FN+TN does not equal the number of training examples. Likewise, the sum for the validation data does not equal the number of validation examples.

Part 1

EPOCH 1

TP = 64
FP = 25
TN = 139431.9780
FN = 188.3956
TP+FP+TN+FN = 139709.3736

The above sum is nowhere close to 182276, and the same holds for all subsequent epochs. Why is this the case?
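The mismatch in Part 1 can be reproduced with plain arithmetic. The sketch below only sums the epoch 1 values copied from the log above and compares them with the 182276 training examples; the variable names are illustrative and not taken from the tutorial.

```python
# Epoch 1 values copied from the training log quoted above.
epoch1 = {"tp": 64.0, "fp": 25.0, "tn": 139431.9780, "fn": 188.3956}
n_train = 182276  # number of training examples reported by the tutorial

total = sum(epoch1.values())
shortfall = n_train - total

print(f"TP+FP+TN+FN = {total:.4f}")
print(f"missing from the total = {shortfall:.4f}")
```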

Part 2

As the number of epochs increases, the total decreases further. For example, compare the values for epoch 2 with those for epoch 1.

EPOCH 2

TP = 25
FP = 5.67
TN = 93973.1538
FN = 136.2967
TP+FP+TN+FN = 94140.1205

The total has now dropped by about 45569 compared with epoch 1. The same pattern continues in later epochs.

  1. Shouldn't the total sum be the same?
  2. If not then why does it keep on decreasing?

Part 3

Why are the values of TP, FP, FN and TN floating-point numbers in both training and validation? As I understand it, they should always be integers: according to the tutorial's "Understanding useful metrics" section, these values are counts.
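For comparison, this is what I would expect from plain counting. The sketch below is a hand-rolled confusion count in pure Python, not the Keras implementation; the function name `confusion_counts`, the 0.5 threshold and the sample data are assumptions for illustration. Every example falls into exactly one of the four cells, so the counts are integers and add up to the number of examples.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN for binary labels (0/1) and predicted probabilities."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob > threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


# Tiny made-up data set, only to illustrate the invariant.
y_true = [1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3]
counts = confusion_counts(y_true, y_prob)
print(counts)
```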

I ran into a similar issue: the sum of TP, TN, FP and FN doubled each epoch, and the values were not integers. The cause was that my model was built with TensorFlow's Keras, but I had imported the standalone keras package directly.

so instead of

import keras

use

from tensorflow import keras
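As a minimal sketch of using one Keras consistently, the four count metrics can be created from the same `tensorflow.keras` module that builds the model. The helper name `build_count_metrics` is hypothetical, and the import sits inside the function only so the snippet can be loaded without TensorFlow installed.

```python
def build_count_metrics():
    """Return the four confusion-matrix metrics from TensorFlow's bundled Keras."""
    # Assumption: TensorFlow 2.x is installed; imported lazily on purpose.
    from tensorflow import keras

    return [
        keras.metrics.TruePositives(name="tp"),
        keras.metrics.FalsePositives(name="fp"),
        keras.metrics.TrueNegatives(name="tn"),
        keras.metrics.FalseNegatives(name="fn"),
    ]
```

The returned list would then be passed as `metrics=` to `model.compile(...)` on a model built with that same `keras` module.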
