Difference between predict and predict_proba in scikit-learn

Question

Suppose I have created a model, and my target variable is 0, 1 or 2. If I use predict, the result for each row is one of 0, 1 or 2. But if I use predict_proba, I get a row with 3 columns for each input row, for example:

   model = ... Classifier       # It could be any classifier
   m1 = model.predict(mytest)
   m2 = model.predict_proba(mytest)

   # Now suppose m2[3] = [0.6, 0.2, 0.2]

Suppose I use both predict and predict_proba. If I get the above result at index 3 of the predict_proba output, should I see 0 at index 3 of the predict output? I am trying to understand how predict and predict_proba relate to each other when used on the same model.

Answer 1

predict() is used to predict the actual class (in your case one of 0, 1 or 2).
predict_proba() is used to predict the class probabilities.

From the example output that you shared:

predict() would output class 0, since the class probability for 0 is the highest at 0.6.
[0.6, 0.2, 0.2] is the output of predict_proba, which denotes that the probabilities for classes 0, 1 and 2 are 0.6, 0.2 and 0.2 respectively.
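To make the mapping concrete, here is a minimal sketch in plain Python. It assumes the typical behaviour where the predicted label is the class with the highest probability; the variable names and values are hypothetical and simply mirror the example in the question.

```python
# Hypothetical values mirroring the example in the question.
classes = [0, 1, 2]            # class labels, in sorted order
proba_row = [0.6, 0.2, 0.2]    # one row of predict_proba output

# Position of the largest probability in the row.
best_index = max(range(len(proba_row)), key=lambda i: proba_row[i])

# The label at that position is what predict would typically return.
predicted_label = classes[best_index]
print(predicted_label)  # prints 0
```

This is only an illustration of the relationship, not how any particular estimator implements predict internally.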

Now, as the documentation for predict_proba mentions, the resulting array is ordered based on the labels you've been using:

The returned estimates for all classes are ordered by the label of classes.

Therefore, in your case, where the class labels are [0, 1, 2], the output of predict_proba contains the probabilities in that order: 0.6 is the probability that the instance is classified as 0, and the two 0.2 values are the probabilities that it is classified as 1 and 2 respectively.
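The column ordering can be checked on a real model. The sketch below is an assumption-laden illustration rather than part of the original answer: it picks LogisticRegression and the iris dataset (neither is mentioned in the question) because iris happens to have the labels 0, 1 and 2. The column order of predict_proba is exposed through the fitted model's classes_ attribute.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Assumed setup: any classifier with predict_proba could be used instead.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

labels = model.predict(X)        # one label per row
proba = model.predict_proba(X)   # one row of probabilities per input row

# Column j of proba corresponds to model.classes_[j].
print(model.classes_)
print(proba.shape)

# Rebuild the labels from the probabilities and compare with predict.
from_proba = model.classes_[np.argmax(proba, axis=1)]
print(np.array_equal(labels, from_proba))
```

For this model the labels rebuilt from the probabilities should match predict. That is not guaranteed for every estimator: the scikit-learn documentation notes, for example, that SVC with probability=True can give probabilities that are inconsistent with predict.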

Answer 1 (score 4, accepted, 2020-04-13 10:03:16)