Scikit-learn classification threshold

Question

I'm using scikit-learn for binary classification, and right now I'm trying the Logistic Regression classifier. After training the classifier, I print out the classification results and the probability of each sample belonging to each class:

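from sklearn.linear_model import LogisticRegression

logreg = LogisticRegression()
logreg.fit(X_train, y_train)
print(logreg.predict(X_test))        # predicted class labels
print(logreg.predict_proba(X_test))  # per-class probabilities, one column per class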

and so I get something like:

[-1 1 1 -1 1 -1...-1]
[[  8.64625237e-01   1.35374763e-01]
 [  3.57441028e-01   6.42558972e-01]
 [  1.67970096e-01   8.32029904e-01]
 [  9.20026249e-01   7.99737513e-02]
 [  1.20456011e-02   9.87954399e-01]
 [  6.48565595e-01   3.51434405e-01]...]

etc. It looks like an object is assigned to whichever class has a probability above 0.5. I'm looking for a way to adjust this number so that, for example, the probability of being in class 1 must exceed 0.7 for an object to be classified as such. Is there a way to do this? I have already looked at parameters such as 'tol' and 'weight', but I wasn't sure whether they were what I was looking for or whether they were working...
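The 0.5 observation can be checked directly. The following is a minimal sketch, assuming synthetic data from make_classification as a stand-in for the real dataset (the data and the names default_preds and matches_predict are illustrative, not from the original post). It compares predict against picking the most probable column of predict_proba, which for two classes is the same as a 0.5 cutoff.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption; not the asker's dataset).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
y = np.where(y == 1, 1, -1)  # relabel to -1 / 1 as in the question

logreg = LogisticRegression().fit(X, y)
proba = logreg.predict_proba(X)

# Columns of predict_proba follow the order of logreg.classes_,
# so the most probable column index maps straight to a class label.
default_preds = logreg.classes_[np.argmax(proba, axis=1)]

# True when predict() agrees with "pick the class whose probability is highest".
matches_predict = np.array_equal(default_preds, logreg.predict(X))
print(matches_predict)
```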

Answer 1

You can set your THRESHOLD like this:

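import numpy as np
THRESHOLD = 0.7
# Column 1 is P(class 1); the question's labels are -1 and 1, so fall back to -1.
preds = np.where(logreg.predict_proba(X_test)[:, 1] > THRESHOLD, 1, -1)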

For more detail, see "sklearn LogisticRegression and changing the default threshold for classification".
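For reuse across several thresholds, the same idea can be wrapped in a small function. This is only a sketch under stated assumptions: predict_with_threshold is a hypothetical helper (not part of scikit-learn), the data is synthetic, and the labels are relabelled to -1 / 1 to mirror the question. It reads the label values from the fitted model's classes_ attribute instead of hard-coding them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def predict_with_threshold(model, X, threshold=0.5):
    """Hypothetical helper: assign model.classes_[1] only when its predicted
    probability exceeds `threshold`; otherwise assign model.classes_[0]."""
    proba_pos = model.predict_proba(X)[:, 1]  # column 1 <-> model.classes_[1]
    return np.where(proba_pos > threshold, model.classes_[1], model.classes_[0])


# Synthetic stand-in data (an assumption; not the asker's dataset).
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
y = np.where(y == 1, 1, -1)  # relabel to -1 / 1 as in the question
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

logreg = LogisticRegression().fit(X_train, y_train)
preds_default = predict_with_threshold(logreg, X_test, threshold=0.5)
preds_strict = predict_with_threshold(logreg, X_test, threshold=0.7)

# A stricter threshold can only turn some 1 predictions into -1, never the reverse.
print((preds_default == 1).sum(), (preds_strict == 1).sum())
```

A higher threshold trades recall on class 1 for precision, so it is worth checking both counts (or a precision/recall report) on held-out data before settling on 0.7.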

Answer 1: 0 2018-09-10 08:16:04
