Difference between predict vs predict_proba in scikit-learn
Suppose I have created a model, and my target variable is either 0, 1 or 2. It seems that if I use predict, the answer is one of 0, 1 or 2. But if I use predict_proba, I get a row with 3 columns for each input row, for example:
model = ...  # Classifier; it could be any classifier
m1 = model.predict(mytest)
m2 = model.predict_proba(mytest)
# Now suppose m2[3] = [0.6, 0.2, 0.2]
Suppose I use both predict and predict_proba. If at index 3 I get the above result from predict_proba, then at index 3 of the result of predict I should see 0. Is this the case? I am trying to understand how the outputs of predict and predict_proba on the same model relate to each other.
predict() is used to predict the actual class (in your case one of 0, 1 or 2). predict_proba() is used to predict the class probabilities.
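As a rough illustration of the difference in output shape, here is a minimal sketch. The Iris dataset and LogisticRegression are assumptions chosen only because Iris happens to have three classes labelled 0, 1 and 2; the question does not say which data or classifier is used.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Assumption: Iris stands in for the asker's data (three classes: 0, 1, 2).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

labels = model.predict(X[:5])        # one class label per sample
probs = model.predict_proba(X[:5])   # one probability per class, per sample

print(labels.shape)        # (5,)
print(probs.shape)         # (5, 3)
print(probs.sum(axis=1))   # each row sums to 1, up to floating-point error
```

So predict gives a 1-D array of labels, while predict_proba gives a 2-D array with one column per class.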
From the example output that you shared, predict() would output class 0, since the class probability for 0 is 0.6. [0.6, 0.2, 0.2] is the output of predict_proba and simply denotes that the class probabilities for classes 0, 1 and 2 are 0.6, 0.2 and 0.2 respectively.
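The step from a probability row to a predicted label can be sketched in plain Python. This is only an illustration of the idea, assuming the classifier picks the most probable class, which is typically but not necessarily how a given estimator implements predict. The names row and classes are made up for the example.

```python
# Probability row copied from the question's example.
row = [0.6, 0.2, 0.2]
# Assumed label order, matching the labels in the question.
classes = [0, 1, 2]

# Find the index of the largest probability, then map it back to a label.
best_index = max(range(len(row)), key=row.__getitem__)
predicted = classes[best_index]

print(predicted)  # 0
```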
Now, as the documentation mentions for predict_proba, the resulting array is ordered based on the labels you've been using:
The returned estimates for all classes are ordered by the label of classes.
Therefore, in your case, where the class labels are [0, 1, 2], the output of predict_proba contains the probabilities in that same order: 0.6 is the probability of the instance being classified as 0, and 0.2 and 0.2 are the probabilities of it being categorised as 1 and 2 respectively.
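To check the column order on a fitted model, the classes_ attribute can be inspected. The sketch below is an assumption-laden example, not the asker's setup: it uses the Iris data with made-up string labels ("low", "mid", "high") and a DecisionTreeClassifier, so that the sorted label order differs from the order in which the labels were defined.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# Hypothetical string labels, chosen so that sorting visibly reorders them.
names = np.array(["low", "mid", "high"])
y_named = names[y]

model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y_named)

# classes_ holds the sorted labels; predict_proba columns follow this order.
print(model.classes_)  # ['high' 'low' 'mid']

probs = model.predict_proba(X)
# Map each row's most probable column back to its label via classes_.
from_probs = model.classes_[np.argmax(probs, axis=1)]
print(np.array_equal(from_probs, model.predict(X)))
```

For this classifier the labels recovered from predict_proba match predict. Some estimators compute the two differently (the SVC documentation, for instance, notes that its probability estimates may be inconsistent with predict), so the equality is worth verifying rather than assuming.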