Python Scikit - bad input shape when calling sklearn.metrics.precision_recall_curve

I'm trying to build a PRC (precision-recall curve) for a CatBoostClassifier.

But when I call sklearn.metrics.precision_recall_curve(y_test, y_score), I get ValueError: bad input shape (11912, 2).

What could be wrong with my current approach? And what do I need to fix here to provide the correct shape?

    import sklearn
    from sklearn import metrics
    y_score = model.predict_proba(X_test)
    prc_auc = sklearn.metrics.precision_recall_curve(y_test, y_score)

Here is how I build the model:

    from catboost import CatBoostClassifier

    model = CatBoostClassifier(
        iterations=50,
        random_seed=63,
        learning_rate=0.15,
        custom_loss=['Accuracy', 'Precision', 'Recall', 'AUC']
    )

    model.fit(
        X_train, y_train,
        cat_features=cat_features,
        eval_set=(X_test, y_test),
        verbose=10,
        plot=True
    )

The short answer is that CatBoostClassifier.predict_proba returns a 2d array (here of shape (11912, 2), one column per class), while sklearn.metrics.precision_recall_curve requires a 1d array (or a 2d array with one column, whichever).

The documentation for CatBoostClassifier says that predict_proba() returns numpy.array, and provides no other information about this method. So I hate the documentation for this package now.

Walking through some poorly commented code gets me to:

    if prediction_type == 'Probability':
        predictions = np.transpose([1 - predictions, predictions])
        return predictions

I'm guessing that column 0 is the probability of class 0, and column 1 is the probability of class 1. So pick whichever of those columns matches the label your test treats as positive, and use that column only. If the positive label in y_test is 1, that is column 1.
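That guess can be checked on a toy array. The sketch below applies the same transpose expression to made-up probabilities. The values are hypothetical, and it assumes that `predictions` holds the class-1 probability before the transpose, which is what the snippet above suggests but the documentation does not confirm.

```python
import numpy as np

# Hypothetical class-1 probabilities for three samples (not from the real model).
predictions = np.array([0.9, 0.2, 0.6])

# Same expression as in the CatBoost snippet above.
proba = np.transpose([1 - predictions, predictions])

print(proba.shape)   # (3, 2): one row per sample, one column per class
print(proba[:, 1])   # [0.9 0.2 0.6]: column 1 is the original class-1 probability
```

Each row sums to 1, and slicing column 1 gives back a 1d array, which is the shape precision_recall_curve accepts. With that in mind, the fix is to pass only that column: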

    prc_auc = sklearn.metrics.precision_recall_curve(y_test, y_score[:, 1])
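One thing to watch for: precision_recall_curve returns three arrays (precision, recall, thresholds), not a single score, so a name like prc_auc ends up holding a tuple. Below is a minimal end-to-end sketch of getting an actual area-under-curve number. It assumes synthetic data from make_classification and uses LogisticRegression purely as a stand-in for the CatBoostClassifier in the question, since both expose predict_proba. Neither is part of the original setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic binary data; LogisticRegression is only a stand-in for the
# CatBoostClassifier from the question (an assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=63)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=63)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

y_score = model.predict_proba(X_test)  # 2d: (n_samples, 2)
pos_score = y_score[:, 1]              # 1d: probability of class 1

# Unpack the three returned arrays instead of storing the tuple.
precision, recall, thresholds = precision_recall_curve(y_test, pos_score)

# Area under the PR curve: recall on the x axis, precision on the y axis.
prc_auc = auc(recall, precision)
print(y_score.shape, pos_score.shape, prc_auc)
```

If only the summary number is needed, sklearn.metrics.average_precision_score(y_test, pos_score) is a related single-value metric, though it is computed differently from the trapezoidal area used here.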


