
Precision, Recall, F-score requiring equal inputs

I am looking at precision, recall, and F-score in scikit-learn, using:

import numpy as np
from sklearn.metrics import precision_score

Then:

y_true = np.array(["one", "two", "three"])
y_pred = np.array(["one", "two"])

precision = precision_score(y_true, y_pred, average=None)
print(precision)

The error returned is:

ValueError: Found input variables with inconsistent numbers of samples: [3, 2]

Given that the input arrays have different lengths, why does scikit-learn require an equal number of inputs? Particularly when evaluating recall (which I would have thought involved taking more guesses than answers).

I can implement my own metrics or just trim the arrays so they match. I want to be sure there is no underlying reason why I should not.

It really depends on what y_true and y_pred mean in your case. But generally, y_true is a vector indicating what the true value is supposed to be for every element of y_pred. I think this is not your case, and to use scikit-learn's metrics you would need to put them in that format.
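As a rough illustration of that format, here is a minimal sketch with one true label and one prediction per sample, in the same order. The data is made up for the example, and the `zero_division` argument assumes scikit-learn 0.22 or later.

```python
import numpy as np
from sklearn.metrics import precision_score

# Hypothetical data: element i of y_pred is the prediction for element i of y_true.
y_true = np.array(["one", "two", "three"])
y_pred = np.array(["one", "two", "two"])

# Fix the class order so the per-class scores are easy to read.
labels = ["one", "two", "three"]

# average=None returns one precision value per class.
# zero_division=0 (scikit-learn >= 0.22) silences the warning for "three",
# which is never predicted and therefore has an undefined precision.
precision = precision_score(
    y_true, y_pred, labels=labels, average=None, zero_division=0
)
print(dict(zip(labels, precision)))
```

With aligned arrays the call succeeds, and a class that is never predicted simply gets the `zero_division` value.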

So in the case of binary classification, with y_true and y_pred as arrays of 0s and 1s, precision will be:

correct_classifications = (y_true == y_pred).astype(int)
precision = sum(y_pred * correct_classifications) / sum(y_pred)

Here you see that you need y_true and y_pred to be the same length.
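To check the formula above, the following sketch applies it to a small made-up pair of 0/1 arrays and compares the result with scikit-learn's `precision_score`. The arrays are assumptions for illustration only.

```python
import numpy as np
from sklearn.metrics import precision_score

# Hypothetical binary labels and predictions, aligned element by element.
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 1, 1, 0, 0])

# 1 where the prediction matches the true label, 0 otherwise.
correct_classifications = (y_true == y_pred).astype(int)

# Positive predictions that were also correct, i.e. true positives.
true_positives = np.sum(y_pred * correct_classifications)

# Precision = true positives / all positive predictions.
manual_precision = true_positives / np.sum(y_pred)

sklearn_precision = precision_score(y_true, y_pred)
print(manual_precision, sklearn_precision)
```

The element-wise comparison `y_true == y_pred` is only meaningful when both arrays have one entry per sample, which is why the lengths have to match.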

That is quite simply because sklearn is playing it safe here.

It doesn't make sense to compute metrics when you haven't made predictions for 100% of the test set.

Let's say you have 1M data points in your dataset but you only predict 200k. Are those the first 200k points? The last? Spread all over? How would the library know which matches which?

You have to have a 1:1 correspondence at the input of the metrics calculation. If you don't have predictions for some points, throw them out (but make sure you know why you don't have such predictions in the first place, and that it's not a problem with the pipeline). You don't want to say you have 100% recall at 1% precision when in the end you only predicted for 10% of the dataset.
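One way to throw out the points without predictions while keeping track of how much of the dataset was actually scored is sketched below. The `-1` sentinel for "no prediction" and the sample data are assumptions made for this example, not a scikit-learn convention.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical binary ground truth for six samples.
y_true = np.array([1, 0, 1, 1, 0, 1])

# Hypothetical predictions; -1 is an assumed marker for "no prediction made".
y_pred = np.array([1, 0, -1, 1, 1, -1])

# Keep only the samples that actually received a prediction.
has_prediction = y_pred != -1

# Fraction of the dataset that was scored; report it next to the metrics.
coverage = has_prediction.mean()

precision = precision_score(y_true[has_prediction], y_pred[has_prediction])
recall = recall_score(y_true[has_prediction], y_pred[has_prediction])

print(f"coverage={coverage:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Applying the same boolean mask to both arrays preserves the 1:1 correspondence, and reporting the coverage alongside the scores makes it clear that they describe only part of the dataset.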


 