
Why does implementing class weights make the model worse

I am trying to do binary classification, and one class (0) is approximately one third the size of the other class (1). When I run the raw data through a normal feed-forward neural network, the accuracy is about 0.78. However, when I implement class_weights, the accuracy drops to about 0.49. The ROC curve also seems to do better without the class_weights. Why does this happen, and how can I fix it?

I have already tried changing the model and adding regularization, dropout, etc., but nothing seems to change the overall accuracy.

This is how I get my weights:

```python
import numpy as np
from sklearn.utils import class_weight

# Recent scikit-learn versions require keyword arguments here
class_weights = class_weight.compute_class_weight(
    class_weight='balanced', classes=np.unique(y_train), y=y_train)
class_weight_dict = dict(enumerate(class_weights))
```
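For reference, `'balanced'` weights each class by `n_samples / (n_classes * n_samples_in_class)`, so a 25/75 split gives the minority class three times the weight of the majority. A quick sketch with toy labels (the label array below is illustrative, not the asker's data):

```python
import numpy as np
from sklearn.utils import class_weight

# Toy labels mirroring the question's imbalance: 25% class 0, 75% class 1
y_train = np.array([0] * 25 + [1] * 75)

weights = class_weight.compute_class_weight(
    class_weight='balanced', classes=np.unique(y_train), y=y_train)
class_weight_dict = dict(enumerate(weights))

# 'balanced' = n_samples / (n_classes * n_samples_per_class)
print(class_weight_dict)  # {0: 2.0, 1: 0.666...}
```

The dict is then passed to Keras as `model.fit(..., class_weight=class_weight_dict)`, which scales each sample's contribution to the loss by its class weight.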

Here are the results without the weights:

[screenshots: accuracy/loss curves and ROC curve without class weights]

Here are the results with the weights:

[screenshots: accuracy/loss curves and ROC curve with class weights]

I would expect the results to be better with the class_weights, but the opposite seems to be true. Even the ROC curve does not seem to do any better with the weights.

Due to the class imbalance, a very weak baseline of always predicting the majority class will achieve an accuracy of approximately 75%.
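That baseline is easy to check; a short sketch with toy labels at the question's ~1:3 ratio, using scikit-learn's `DummyClassifier` to stand in for "always predict the majority class" (the data here is made up for illustration):

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Toy labels with the question's ~1:3 imbalance
y = np.array([0] * 25 + [1] * 75)
X = np.zeros((100, 1))  # features are irrelevant to this baseline

# 'most_frequent' always predicts the majority class
baseline = DummyClassifier(strategy='most_frequent').fit(X, y)
acc = accuracy_score(y, baseline.predict(X))
print(acc)  # 0.75 -- close to the 0.78 the unweighted network reports
```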

The validation curve of the network trained without class weights suggests it converges to a solution close to always predicting the majority class: the network barely improves over the validation accuracy it reaches in the first epoch.

I would recommend looking at the confusion matrix and at the precision and recall metrics to get more information about which model is actually better.
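As an illustration, a sketch with hypothetical predictions shows how a model that almost always predicts the majority class can reach ~77% accuracy while barely detecting class 0 (all numbers below are made up for demonstration):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels and predictions: the model predicts 1
# for all but 2 of the 100 samples
y_true = np.array([0] * 25 + [1] * 75)
y_pred = np.array([0] * 2 + [1] * 98)

print(confusion_matrix(y_true, y_pred))
# rows = true class, columns = predicted class:
# [[ 2 23]
#  [ 0 75]]

# Accuracy is 0.77, but recall on class 0 exposes the problem
print(recall_score(y_true, y_pred, pos_label=0))     # 0.08
print(precision_score(y_true, y_pred, pos_label=0))  # 1.0
```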

This answer comes late, but I hope it is helpful anyway. I just want to add four points:

  • Since your data is split roughly minority: 25% and majority: 75%, accuracy is computed as:

accuracy = (true positives + true negatives) / (true positives + true negatives + false positives + false negatives)

Thus, if you use accuracy as the metric, almost any model can reach around 75% accuracy by simply predicting the majority class all the time. That is why, on the validation set, the unweighted model rarely predicts the minority class correctly.

  • With class weights, the learning curve is not smooth, but the model actually starts to learn the minority class, which is why it fails from time to time on the validation set.

  • As already stated, switching to a metric such as the F1 score would help. Since you are using TensorFlow: TensorFlow Addons provides an F1 score metric; you can find it in their documentation. Personally, I look at the classification report in scikit-learn. Say you want to see the model's performance on the validation set (X_val, y_val):

```python
from sklearn.metrics import classification_report

# predict() returns probabilities for a sigmoid output,
# so threshold at 0.5 to get class labels
y_prob = model.predict(X_val, batch_size=64, verbose=1)
y_predict = (y_prob > 0.5).astype(int)
print(classification_report(y_val, y_predict))
```
  • Other techniques you might want to try include upsampling the minority class and downsampling the majority class at the same time, or generating synthetic minority samples with SMOTE.
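A sketch of plain random upsampling with scikit-learn's `resample` (SMOTE, which instead creates synthetic interpolated samples, lives in the separate imbalanced-learn package; the arrays below are toy stand-ins for the real training data):

```python
import numpy as np
from sklearn.utils import resample

# Toy imbalanced dataset (placeholder for the real X_train / y_train)
rng = np.random.RandomState(0)
X = rng.rand(100, 4)
y = np.array([0] * 25 + [1] * 75)

X_min, X_maj = X[y == 0], X[y == 1]

# Upsample the minority class with replacement to match the majority size
X_min_up = resample(X_min, replace=True, n_samples=len(X_maj), random_state=0)

X_bal = np.vstack([X_min_up, X_maj])
y_bal = np.array([0] * len(X_min_up) + [1] * len(X_maj))
print(np.bincount(y_bal))  # [75 75]
```

Resample only the training split; the validation and test sets should keep the original class distribution so the metrics stay honest.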

Best of luck!
