
How to find the feature names of the coefficients using StackingClassifier + Logistic Regression (binary classification)

I am trying to use StackingClassifier with logistic regression (binary classifier). Sample code:

from sklearn.datasets import load_iris
from mlxtend.classifier import StackingClassifier
from sklearn.linear_model import LogisticRegression


iris = load_iris()
X = iris.data
y = iris.target

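y[y == 2] = 1  # merge classes 1 and 2 to make the problem binary

# The L1 penalty needs a solver that supports it (e.g. liblinear);
# the default lbfgs solver in recent scikit-learn versions does not.
LR1 = LogisticRegression(penalty='l1', solver='liblinear')
LR2 = LogisticRegression(penalty='l1', solver='liblinear')
LR3 = LogisticRegression(penalty='l1', solver='liblinear')
LR4 = LogisticRegression(penalty='l1', solver='liblinear')
LR5 = LogisticRegression(penalty='l1', solver='liblinear')


clfs1 = [LR1, LR2]
clfs2 = [LR3, LR4, LR5]

# Flatten the two groups into one list: [LR1, LR2, LR3, LR4, LR5]
cls_ = clfs1 + clfs2

sclf = StackingClassifier(classifiers=cls_,
    meta_classifier=LogisticRegression(penalty='l1', solver='liblinear'),
    use_probas=True, average_probas=False)

sclf.fit(X, y)

sclf.meta_clf_.coef_  # weights learned by the meta-classifier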

Each first-level logistic regression outputs a probability for each of the two classes. Since I am stacking 5 classifiers, sclf.meta_clf_.coef_ gives 10 weight values:

array([[-0.96815163, 1.25335525, -0.03120535, 0.8533569 , -2.6250897 , 1.98034805, -0.361378 , 0.00571954, -0.03206343, 0.53138651]])

I am confused about the order of the weight values. That is:

  • Are the first two values (-0.96815163, 1.25335525) for the first logistic regression, LR1?

  • Are the next two values (-0.03120535, 0.8533569) for the second logistic regression, LR2?

I want to find out which weight values belong to which logistic regression (LR) in the stacking classifier.

Please help.
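
For context on where the ordering comes from: with use_probas=True and average_probas=False, mlxtend's StackingClassifier is understood to place each base classifier's predict_proba output side by side, in the order the classifiers were passed in. A minimal NumPy sketch of that layout, using made-up probabilities (the proba_lr1 and proba_lr2 arrays are hypothetical, not output from the code above):

```python
import numpy as np

# Hypothetical predict_proba outputs for 3 samples from two base classifiers.
# Each row is [P(class 0), P(class 1)].
proba_lr1 = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
proba_lr2 = np.array([[0.7, 0.3], [0.1, 0.9], [0.5, 0.5]])

# Stack the blocks column-wise, keeping classifier order:
# columns 0-1 come from LR1, columns 2-3 from LR2.
meta_features = np.concatenate([proba_lr1, proba_lr2], axis=1)

print(meta_features.shape)  # (3, 4)
print(meta_features[0])     # first sample: LR1's two probabilities, then LR2's
```

With 5 base classifiers and 2 classes this gives 10 columns, and the meta-classifier learns one weight per column.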

If your output is:

array([[-0.96815163, 1.25335525, -0.03120535, 0.8533569 , -2.6250897 , 1.98034805, -0.361378 , 0.00571954, -0.03206343, 0.53138651]])

Then the weights come in pairs, one pair per base classifier, in the order the classifiers were passed to StackingClassifier:

-0.96815163, 1.25335525: weights on LR1's predicted probabilities for class 0 and class 1

-0.03120535, 0.8533569: weights on LR2's predicted probabilities for class 0 and class 1

-2.6250897, 1.98034805: weights on LR3's predicted probabilities for class 0 and class 1

-0.361378, 0.00571954: weights on LR4's predicted probabilities for class 0 and class 1

-0.03206343, 0.53138651: weights on LR5's predicted probabilities for class 0 and class 1
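
To make the mapping explicit, one option is to build a label for each coefficient. This is only a sketch: it assumes the column order described above (classifier order, with class 0 before class 1 inside each pair), and the names clf_names, classes and feature_names are illustrative, not part of mlxtend:

```python
# Coefficients copied from the output above (one row, 10 values).
coef = [[-0.96815163, 1.25335525, -0.03120535, 0.8533569, -2.6250897,
         1.98034805, -0.361378, 0.00571954, -0.03206343, 0.53138651]]

# Illustrative labels, listed in the same order as the `classifiers` argument.
clf_names = ["LR1", "LR2", "LR3", "LR4", "LR5"]
classes = [0, 1]

# One label per meta-feature column: classifier first, then class.
feature_names = [f"{name}_proba_class{c}" for name in clf_names for c in classes]

# Pair each label with its weight.
weights = dict(zip(feature_names, coef[0]))

for name, w in weights.items():
    print(f"{name}: {w}")
```

With a fitted model, coef[0] would be replaced by sclf.meta_clf_.coef_[0].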
