
Is it possible for a Precision-Recall curve or a ROC curve to be a horizontal line?

I am working on a binary classification task on imbalanced data.

Since accuracy is not very meaningful in this case, I use Scikit-Learn to compute the Precision-Recall curve and the ROC curve to evaluate the model's performance.

But I found that both curves come out as a horizontal line when I use a Random Forest with a lot of estimators. It also happens when I fit an SGD classifier.

The ROC chart is as follows:


And the Precision-Recall chart:


Since Random Forest is randomized, I don't get a horizontal line on every run; sometimes I get regular ROC and PR curves. But the horizontal line is much more common.

Is this normal, or did I make a mistake in my code?

Here is the snippet of my code:

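import matplotlib.pyplot as plt
from sklearn.metrics import average_precision_score, precision_recall_curve

classifier.fit(X_train, Y_train)
# Use decision_function when the classifier has one (e.g. SGDClassifier);
# otherwise fall back to the positive-class probability (e.g. RandomForestClassifier).
if hasattr(classifier, "decision_function"):
    scores = classifier.decision_function(X_test)
else:
    scores = classifier.predict_proba(X_test)[:, 1]

precision, recall, _ = precision_recall_curve(Y_test, scores, pos_label=1)
average_precision = average_precision_score(Y_test, scores)

plt.plot(recall, precision, label='area = %0.2f' % average_precision, color="green")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.title('Precision Recall Curve')
plt.legend(loc="lower right")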

Yes, it is possible. If the model perfectly separates the data into two piles, the ROC curve first goes vertically from 0 to 1 true positive rate without any false positives, as the threshold passes over the pile of true positives, and then horizontally from 0 to 1 false positive rate, as the threshold passes over the pile of true negatives.

If you get the same ROC curve on a held-out test set, you are golden. If you get the same ROC curve on 5 different k-fold cross-validation test sets, you are platinum.
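The perfectly separated case described in this answer can be reproduced with a small sketch. The `roc_points` helper and the toy labels and scores are made up for illustration (they are not from the question); the helper lowers the threshold one distinct score at a time and records the resulting (FPR, TPR) pair.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) points as the threshold is lowered past each distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Highest scores first, so lowering the threshold admits them in this order.
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Admit every example that shares the current score in one step.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


# Hypothetical toy data: every positive scores higher than every negative.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1]
points = roc_points(labels, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

With this data the FPR stays at 0 while the TPR climbs to 1 (the vertical part), and the TPR then stays at 1 while the FPR climbs to 1 (the horizontal part).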

In addition to the other answers: it's possible that you have duplicated your label as a feature in the dataset. When Random Forest samples features, it doesn't always get that feature as a predictor, and then you get a "normal-looking" ROC curve (i.e. the other features can't predict the label exactly); when the duplicated label/feature does end up in the sample, your model has 100% accuracy by definition.

SGD can have the same issue, in a situation where linear regression would fail. In a linear regression, you'd have a singular or near-singular matrix and the estimation would fail. With SGD, since you're re-estimating as each new point arrives, the math doesn't fail (though your model will still be suspect).
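One quick way to test the duplicated-label hypothesis is to look for feature columns that are identical to the label column. The sketch below assumes the features are available as a list of rows; `find_label_copies` and the toy data are hypothetical names introduced for illustration.

```python
def find_label_copies(rows, labels):
    """Return the indices of feature columns that are exact copies of the label."""
    n_features = len(rows[0])
    leaked = []
    for col in range(n_features):
        # A column leaks the label if it matches the label in every row.
        if all(row[col] == label for row, label in zip(rows, labels)):
            leaked.append(col)
    return leaked


# Hypothetical toy data: column 2 is an exact copy of the label.
labels = [0, 1, 0, 1, 1]
rows = [
    [0.3, 5.0, 0],
    [0.1, 2.0, 1],
    [0.7, 1.0, 0],
    [0.2, 4.0, 1],
    [0.9, 3.0, 1],
]
leaked = find_label_copies(rows, labels)
print(leaked)
```

This only catches exact copies. A label that has been re-encoded or transformed (for example scaled, or stored as strings) would need a looser comparison.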

The other 2 answers give only sufficient conditions for seeing a horizontal line (i.e. they are possible causes of a horizontal line, but not the only possibilities). Here are the necessary and sufficient conditions:

If you see a horizontal line in a PR curve, it must be at the top (precision = 1), and it means the examples in that threshold range are all TPs. The longer the line, the more TPs (because a longer line spans a larger increase in recall).


Let "TP" denote the number of true positives and "PP" the number of predicted positives, so that precision = TP/PP.

A horizontal line means recall increases by some amount while precision stays unchanged. Let's discuss these 2 things separately:

  1. Recall increases by some amount ->
  • TP increases by some amount.
  • Suppose TP increases by the smallest amount, 1, and let x be the increase in PP. By definition x>=1.
  2. Precision is unchanged ->
  • (TP+1)/(PP+x)=TP/PP. Cross-multiplying gives PP=x*TP, so x=PP/TP, i.e. x=1/precision.

When no two examples share the same score, lowering the threshold admits one example at a time, so the step that adds 1 to TP adds exactly 1 to PP, i.e. x=1. Then PP/TP=1, so precision TP/PP=1 and only positive examples are being added. QED. (If several examples share the same score, x can exceed 1 and the argument no longer forces precision to be 1.)
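The argument can be checked numerically on a small example. This is a minimal sketch that assumes no two examples share the same score, so each threshold step admits exactly one example; `pr_points` and the toy data are hypothetical and only serve as illustration.

```python
def pr_points(labels, scores):
    """Return (recall, precision) after admitting examples one at a time, highest score first."""
    n_pos = sum(labels)
    ranked = sorted(zip(scores, labels), reverse=True)
    points = []
    tp = 0
    for predicted_pos, (_, label) in enumerate(ranked, start=1):
        tp += label  # label is 1 for a positive example, 0 for a negative one
        points.append((tp / n_pos, tp / predicted_pos))
    return points


# Hypothetical toy data with no tied scores.
labels = [1, 1, 1, 0, 1, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.1]
points = pr_points(labels, scores)

# Consecutive steps where recall rises while precision stays the same.
horizontal = [
    (a, b) for a, b in zip(points, points[1:])
    if b[0] > a[0] and b[1] == a[1]
]
for a, b in horizontal:
    print(f"recall {a[0]:.2f} -> {b[0]:.2f} at precision {a[1]:.2f}")
```

In this example the only horizontal steps are the ones at precision 1, where the newly admitted example is a true positive. The later step that raises recall from 0.75 to 1 also changes precision, so it is not horizontal.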
