
How to use F1 score as the CrossValidator evaluator for a binary problem (BinaryClassificationEvaluator) in PySpark 2.3

My use case is a common one: binary classification with unbalanced labels, so we decided to use the F1 score for hyper-parameter selection via cross-validation. We are using PySpark 2.3 and pyspark.ml, and we create a CrossValidator object, but for the evaluator the issues are the following:

  • BinaryClassificationEvaluator does not offer F1 score as an evaluation metric.
  • MulticlassClassificationEvaluator has an F1 metric, but it returns misleading results. My guess is that it calculates F1 for every class (in this case just two) and returns a weighted average across them; since the negative class (y=0) is predominant, this produces a high F1 even when the model is really bad (F1 for the positive class is 0).
  • Recent versions of MulticlassClassificationEvaluator added the parameter metricLabel, which I think allows specifying which label to evaluate (in my case I would set it to 1), but it is not available in Spark 2.3.
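The averaging pitfall described in the second bullet is easy to reproduce outside Spark. A small illustrative sketch with sklearn (the 98/2 class split is made up for the example):

```python
# Illustrative sketch: with 98 negatives and 2 positives, a model that
# predicts everything negative still gets a high weighted-average F1
# (the kind of average Spark's 'f1' metric reports), while the F1 for
# the positive class alone is 0.
from sklearn.metrics import f1_score

y_true = [0] * 98 + [1] * 2
y_pred = [0] * 100  # degenerate model: always predicts the negative class

weighted = f1_score(y_true, y_pred, average='weighted')
positive = f1_score(y_true, y_pred, pos_label=1, zero_division=0)

print(round(weighted, 3))  # ~0.97, looks great on paper
print(positive)            # 0.0, useless for the positive class
```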

The problem is that I use a corporate/enterprise Spark cluster with no plans to upgrade the current version (2.3). So the question is: how can I use the F1 score as a CrossValidator evaluator in the binary case, given that we are restricted to Spark 2.3?

If you could use Spark 3.0+, the easiest way would be to use the F-measure-by-label metric, specifying the label (and setting beta to 1):

evaluator = MulticlassClassificationEvaluator(metricName='fMeasureByLabel', metricLabel=1, beta=1.0) 

But since you are restricted to v2.3, you can either

  1. reimplement the CrossValidator functionality yourself. pyspark.mllib.evaluation.MulticlassMetrics has an fMeasure-by-label method; see the example for reference.

  2. switch your metric to areaUnderPR from BinaryClassificationEvaluator, which is a "goodness of model" metric well suited to unbalanced labels, and should do the job for you. This blog post compares F1 and AUC-PR.
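For option 1, the per-label computation that MulticlassMetrics.fMeasure(label, beta) performs reduces to ordinary precision/recall restricted to a single label. A Spark-free sketch of that formula (the function name is mine, for illustration, not part of any Spark API):

```python
# Spark-free sketch of the per-label F-measure that
# pyspark.mllib.evaluation.MulticlassMetrics.fMeasure(label, beta) computes.
def f_measure_by_label(preds, labels, positive=1.0, beta=1.0):
    # Count outcomes restricted to the chosen positive label.
    tp = sum(1 for p, y in zip(preds, labels) if p == positive and y == positive)
    fp = sum(1 for p, y in zip(preds, labels) if p == positive and y != positive)
    fn = sum(1 for p, y in zip(preds, labels) if p != positive and y == positive)
    if tp == 0:
        return 0.0  # nothing correctly predicted positive
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(f_measure_by_label([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))  # 2/3 ≈ 0.667
```

With beta=1 this is exactly the F1 score, so it can serve as a reference when validating a hand-rolled evaluator.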

You can create a class for this. I had the same problem with my company's Spark 2.4 cluster, so I wrote an F1-score evaluator for binary classification. I had to implement the .evaluate and .isLargerBetter methods on the new class, since those are what CrossValidator calls. Here is sample code from when I tried it on this dataset:

class F1BinaryEvaluator():
    """Duck-typed evaluator for CrossValidator: only .evaluate() and
    .isLargerBetter() are required."""

    def __init__(self, predCol="prediction", labelCol="label", metricLabel=1.0):
        self.labelCol = labelCol
        self.predCol = predCol
        self.metricLabel = metricLabel

    def isLargerBetter(self):
        # Higher F1 is better, so CrossValidator should maximize this metric.
        return True

    def evaluate(self, dataframe):
        # Count true positives, false positives and false negatives
        # for the positive (metricLabel) class only.
        pos = str(self.metricLabel)
        tp = dataframe.filter('{0} = {2} and {1} = {2}'.format(self.labelCol, self.predCol, pos)).count()
        fp = dataframe.filter('{0} != {2} and {1} = {2}'.format(self.labelCol, self.predCol, pos)).count()
        fn = dataframe.filter('{0} = {2} and {1} != {2}'.format(self.labelCol, self.predCol, pos)).count()
        if tp == 0:
            return 0.0  # guard against division by zero
        # F1 = tp / (tp + (fp + fn) / 2)
        return tp / (tp + 0.5 * (fn + fp))


f1_evaluator = F1BinaryEvaluator()

from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.classification import GBTClassifier
gbt = GBTClassifier()
paramGrid = (ParamGridBuilder()
             .addGrid(gbt.maxDepth, [3, 5, 7])
             .addGrid(gbt.maxBins, [10, 30])
             .addGrid(gbt.maxIter, [10, 15])
             .build())
cv = CrossValidator(estimator=gbt, estimatorParamMaps=paramGrid, evaluator=f1_evaluator, numFolds=5)

cvModel = cv.fit(train)
cv_pred = cvModel.bestModel.transform(test)

The CV process ran with no problems, though I have not benchmarked its performance. I also compared the evaluator against sklearn.metrics.f1_score, and the values are close.

from sklearn.metrics import f1_score
print("made-up F1 Score evaluator : ", f1_evaluator.evaluate(cv_pred))
print("sklearn F1 Score evaluator : ", f1_score(cv_pred.select('label').toPandas(), cv_pred.select('prediction').toPandas()))

made-up F1 Score evaluator :  0.9363636363636364
sklearn F1 Score evaluator :  0.9363636363636363
