
How to implement AUROC as loss function in tensorflow keras

I'm trying to build a network with tensorflow and keras, for classification with two classes (success or failure). I can play around with the size of the data depending on how I handle NaN data, but for this let's say that my complete input dataset is (502, 68). Most features are continuous, some are binary.

The difficulty is that the data is imbalanced (96% Success).

Because the data is so imbalanced, overfitting comes quickly, and the result that minimizes loss is "just predict everything as a success". I've played around with class weights, but without very convincing results.
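For reference, here is one common way to derive class weights for an imbalance like the one described, the "balanced" heuristic (the labels below are synthetic, mirroring the 96% success rate in the question):

```python
import numpy as np

# hypothetical labels mirroring the question's 96% "success" imbalance
y = np.array([1] * 96 + [0] * 4)

# "balanced" weighting: n_samples / (n_classes * count_of_class)
n_samples, n_classes = len(y), 2
class_weight = {c: n_samples / (n_classes * np.sum(y == c)) for c in (0, 1)}
# failures (class 0) now weigh ~24x more than successes (class 1);
# pass it to training via model.fit(X, y, class_weight=class_weight, ...)
```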

The problem, to me, is the loss function. That's why I would like to use the AUROC as a loss. The only SO post I've found on the topic, Add AUC as loss function for keras, is from 6 years ago and originally made me dismiss the idea: "Well, AUROC isn't differentiable, let's drop this idea".
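Worth noting: while AUROC itself is a step function over positive/negative pair orderings and thus not differentiable, a standard workaround is a pairwise surrogate that replaces the 0/1 indicator with a smooth penalty on score differences. A minimal sketch (the function name and the margin `gamma` are my own, not from any library):

```python
import tensorflow as tf

def pairwise_auc_loss(y_true, y_pred, gamma=0.2):
    """Differentiable surrogate for AUROC: penalize every
    positive/negative pair where the positive does not outscore
    the negative by at least the margin gamma."""
    y_true = tf.reshape(tf.cast(y_true, tf.bool), [-1])
    y_pred = tf.reshape(y_pred, [-1])
    pos = tf.boolean_mask(y_pred, y_true)                      # positive scores
    neg = tf.boolean_mask(y_pred, tf.logical_not(y_true))      # negative scores
    # matrix of all positive-minus-negative score differences
    diff = tf.expand_dims(pos, 0) - tf.expand_dims(neg, 1)
    # squared hinge: zero once the pair is correctly ordered with margin
    return tf.reduce_mean(tf.square(tf.nn.relu(gamma - diff)))
```

One caveat of this sketch: a batch containing only one class yields an empty pair set, so in practice you'd guard against that (or use batches large enough to contain both classes).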

Since then, I have found some more recent algorithms, most notably roc-star, implemented in PyTorch. I would like to apply it as a custom loss function. However, keras expects a custom loss to be a function that takes y_true and y_pred and returns a value, while the roc-star algorithm takes the gradient and values from the previous iteration as input. Do you know a way around this?

I'm using a simple network created with keras.models.Sequential.

So my question has several layers; feel free to respond to any of them while ignoring the others =p

  1. Does anyone know another simple way to use AUROC as a loss function?
  2. Am I too fixated on the AUROC? I guess I could build a simpler, more easily differentiable function based on the confusion matrix that could work as well.
  3. How can I implement the roc-star algorithm as a custom loss function?
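On question 3: a Keras loss does not have to be a stateless function. Subclassing tf.keras.losses.Loss lets you keep non-trainable tf.Variables between calls, which is the hook a roc-star-style loss needs for its previous-epoch statistics. The sketch below only demonstrates the state-carrying mechanism (the objective inside is plain BCE as a placeholder, and all names are my own, not from the roc-star repo):

```python
import tensorflow as tf

class EpochMemoryLoss(tf.keras.losses.Loss):
    """Hypothetical sketch: keep the previous call's predictions and
    labels in non-trainable tf.Variables so the next call can use them,
    the way roc-star uses the previous epoch's values."""

    def __init__(self, buffer_size, name="epoch_memory_loss"):
        super().__init__(name=name)
        # state that survives between calls to the loss
        self.prev_pred = tf.Variable(tf.zeros([buffer_size]), trainable=False)
        self.prev_true = tf.Variable(tf.zeros([buffer_size]), trainable=False)

    def call(self, y_true, y_pred):
        y_true = tf.reshape(tf.cast(y_true, tf.float32), [-1])
        y_pred = tf.reshape(y_pred, [-1])
        # placeholder objective: plain BCE; roc-star would instead rank
        # y_pred against self.prev_pred / self.prev_true here
        loss = tf.keras.losses.binary_crossentropy(y_true, y_pred)
        # remember this batch for the next call (assumes a fixed batch size)
        self.prev_pred.assign(y_pred)
        self.prev_true.assign(y_true)
        return loss
```

An alternative with the same effect is updating the buffers from a Keras callback at epoch end, which maps more directly onto roc-star's per-epoch update.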

Edit: I realized that I did not provide a link to the roc-star algorithm: https://github.com/iridiumblue/roc-star

I'm attempting to use it in my tensorflow model at the moment. It's going subpar: the dataset I'm using is complex and hard to predict, and EDA and other models (e.g. PCA and decision trees) are yielding similar prediction percentages. Nonetheless, here's what I've done to implement it. Be advised that tf keras and KerasRegressor are different and have different documentation, but they work similarly and can basically do the same things.

def auroc(y_true, y_pred):
    # wrap sklearn's roc_auc_score so it runs eagerly as a Keras metric
    return tf.py_function(roc_auc_score, (y_true, y_pred), tf.double)

import itertools
import tensorflow as tf
import pandas as pd
from tensorflow import keras
from tensorflow.keras import models
from tensorflow.keras import datasets
from tensorflow.keras import layers
from sklearn.metrics import roc_auc_score
# 1- Instantiate the model
ourModel = keras.Sequential()
# 2- Specify the shape of the first layer
#    (ourInputShape holds the input feature shape, e.g. (68,))
ourModel.add(layers.Dense(512, activation='relu', input_shape=ourInputShape))
# 3- Add the output layer
#    softmax returns an array of probability scores; in this case we predict
#    one of three classes (CSCANCEL, MEMBERCANCEL, ACTIVE)
ourModel.add(layers.Dense(3, activation='softmax'))

ourModel.compile(optimizer='rmsprop',
                 loss='categorical_crossentropy',
                 metrics=['accuracy', auroc])  # the custom auroc metric used here
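As an aside, wrapping sklearn via tf.py_function is no longer necessary just to track AUROC: modern tf.keras ships a built-in AUC metric. A minimal binary sketch (layer sizes are illustrative, and the input shape matches the (502, 68) dataset from the question):

```python
import numpy as np
import tensorflow as tf

# illustrative binary classifier for 68 input features
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(68,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auroc")])  # built-in AUROC
```

After training, the per-epoch values appear in history.history["auroc"], so no custom wrapper is needed on the metric side; the custom-loss question remains separate.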

Once the model is built like this, you just compile and run it as normal. I've found very little difference in overall performance with this implementation, but figured I'd share nonetheless. Best of luck to you.

