How can I print precision, recall, fscore using Python?

I want to calculate and print precision, recall, fscore and support using sklearn.metrics in Python. I am doing NLP, so my y_test and y_pred are basically words before the vectorisation step.

Below is some information that may help:

y_test:  [0 0 0 1 1 0 1 1 1 0]
y_pred [0.86 0.14 1.   0.   1.   0.   0.04 0.96 0.01 0.99 1.   0.   0.01 0.99
 0.41 0.59 0.02 0.98 1.   0.  ]

x_train 50
y_train 50
x_test 10
y_test 10
x_valid 6
y_valid 6

y_pred dimension:  (20,)
y_test dimension:  (10,)

The full traceback:

  Traceback (most recent call last):
  File "C:\Users\iduboc\Documents\asd-dev\train.py", line 324, in <module>
    precision, recall, fscore, support = score(y_test, y_pred)
  File "C:\Users\iduboc\Python1\envs\asd-v3-1\lib\site-packages\sklearn\metrics\classification.py", line 1415, in precision_recall_fscore_support
    pos_label)
  File "C:\Users\iduboc\Python1\envs\asd-v3-1\lib\site-packages\sklearn\metrics\classification.py", line 1239, in _check_set_wise_labels
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Users\iduboc\Python1\envs\asd-v3-1\lib\site-packages\sklearn\metrics\classification.py", line 71, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "C:\Users\iduboc\Python1\envs\asd-v3-1\lib\site-packages\sklearn\utils\validation.py", line 205, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [10, 20]

My code:

from sklearn.metrics import precision_recall_fscore_support as score

precision, recall, fscore, support = score(y_test, y_pred)
print('precision: {}'.format(precision))
print('recall: {}'.format(recall))
print('fscore: {}'.format(fscore))
print('support: {}'.format(support))

My code to predict the values:

elif clf == 'rndforest':

    # No validation data in rnd forest
    x_train = np.concatenate((x_train, x_valid))
    y_train = np.concatenate((y_train, y_valid))

    model = RandomForestClassifier(n_estimators=int(clf_params['n_estimators']),
                                   max_features=clf_params['max_features'])
    model.fit(pipe_vect.transform(x_train), y_train)

    datetoday = datetime.today().strftime('%d-%b-%Y-%H_%M')
    model_name_save = abspath(os.path.join(
        "models", dataset,
        name_file + '-' + vect + reduction + '-rndforest' + datetoday + '.pickle'))
    print("Model d'enregistrement : ", model_name_save)




    x_test_vect = pipe_vect.transform(x_test)

    y_pred = model.predict_proba(x_test_vect)  

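The error is caused by the predicted and ground-truth vectors having different sizes: y_test has 10 entries while y_pred has 20. The function precision_recall_fscore_support only works if these sizes are the same. The extra entries appear to come from predict_proba, which returns one probability per class for each sample (10 samples × 2 classes = 20 values).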

See the docs:

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html
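To make the size mismatch concrete, here is a minimal sketch that uses only the values printed in the question. It assumes (this is not confirmed by the asker) that the 20 numbers are a flattened predict_proba result with two columns, one per class; the names y_pred_flat, proba and rows_sum_to_one are introduced here for illustration.

```python
import numpy as np

# Values copied from the question's printed output.
y_test = np.array([0, 0, 0, 1, 1, 0, 1, 1, 1, 0])
y_pred_flat = np.array([0.86, 0.14, 1.0, 0.0, 1.0, 0.0, 0.04, 0.96, 0.01, 0.99,
                        1.0, 0.0, 0.01, 0.99, 0.41, 0.59, 0.02, 0.98, 1.0, 0.0])

print(y_test.shape, y_pred_flat.shape)  # lengths differ: 10 vs 20

# Assumption: the 20 values are [P(class 0), P(class 1)] for each of the
# 10 test samples, flattened into one vector.
proba = y_pred_flat.reshape(len(y_test), -1)
print(proba.shape)

# If the assumption holds, every row is a probability distribution.
rows_sum_to_one = np.allclose(proba.sum(axis=1), 1.0)
print(rows_sum_to_one)
```

If every row sums to 1, the two-column reading is at least consistent with the data, and one row then corresponds to one entry of y_test.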

Also, this function expects discrete (non-continuous) class labels. If you pass a list of floats between 0 and 1 (like the y_pred list) as an argument, you will get the following error:

ValueError: Classification metrics can't handle a mix of binary and continuous targets

Example code that produces this error:

y_test =  [0., 0., 0., 1., 1.]
y_pred = [0.86, 0.14, 1., 0., 1.]

from sklearn.metrics import precision_recall_fscore_support as score

precision, recall, fscore, support = score(y_test, y_pred)
print('precision: {}'.format(precision))
print('recall: {}'.format(recall))
print('fscore: {}'.format(fscore))
print('support: {}'.format(support))

So if you want to calculate these metrics, you have to decide in some way which values of the predicted vector are 1 (positive prediction) and which are 0 (negative prediction). For example, you can use a single threshold (e.g. 0.5), or try multiple thresholds (e.g. 0.1, 0.2, 0.3 and so on) and then either select the best one or plot a curve of the metrics at each threshold level.
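The thresholding idea could look roughly like the sketch below. It reuses the numbers printed in the question and assumes they alternate as [P(class 0), P(class 1)] per sample, so every second value is the positive-class probability. The names y_pred_flat, proba_pos, y_pred_labels and sweep are illustrative, not part of the original code.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support as score

# Values copied from the question's printed output.
y_test = np.array([0, 0, 0, 1, 1, 0, 1, 1, 1, 0])
y_pred_flat = np.array([0.86, 0.14, 1.0, 0.0, 1.0, 0.0, 0.04, 0.96, 0.01, 0.99,
                        1.0, 0.0, 0.01, 0.99, 0.41, 0.59, 0.02, 0.98, 1.0, 0.0])

# Assumption: values alternate [P(class 0), P(class 1)], so every second
# value, starting at index 1, is the positive-class probability.
proba_pos = y_pred_flat[1::2]

# Single threshold: probabilities at or above 0.5 become label 1.
threshold = 0.5
y_pred_labels = (proba_pos >= threshold).astype(int)

precision, recall, fscore, support = score(y_test, y_pred_labels)
print('precision: {}'.format(precision))
print('recall: {}'.format(recall))
print('fscore: {}'.format(fscore))
print('support: {}'.format(support))

# Several thresholds: report the positive-class metrics at each level.
sweep = []
for t in (0.1, 0.3, 0.5, 0.7, 0.9):
    labels = (proba_pos >= t).astype(int)
    p, r, f, _ = score(y_test, labels, average='binary')
    sweep.append((t, p, r, f))
    print('threshold={:.1f} precision={:.2f} recall={:.2f} fscore={:.2f}'.format(t, p, r, f))
```

If a fixed cut-off is acceptable, calling model.predict(x_test_vect) instead of predict_proba returns hard class labels directly, which can be passed to the metric function without any thresholding step.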


