Collinear features and their effect on linear models, Task 1: Logistic Regression

%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
import seaborn as sns
import matplotlib.pyplot as plt
data = pd.read_csv('task_d.csv')
data.head()

Output:

         x         y           z          x*x      2*y           2*z+3*x*x     w        target
0   -0.581066   0.841837    -1.012978   -0.604025   0.841837    -0.665927   -0.536277   0
1   -0.894309   -0.207835   -1.012978   -0.883052   -0.207835   -0.917054   -0.522364   0
2   -1.207552   0.212034    -1.082312   -1.150918   0.212034    -1.166507   0.205738    0
3   -1.364174   0.002099    -0.943643   -1.280666   0.002099    -1.266540   -0.665720   0
4   -0.737687   1.051772    -1.012978   -0.744934   1.051772    -0.792746   -0.735054   0
X = data.drop(['target'], axis=1).values
Y = data['target'].values

Doing a perturbation test to check for the presence of collinearity. Task 1: Logistic Regression

data.corr()['target']

Output:

x            0.728290
y           -0.690684
z            0.969990
x*x          0.719570
2*y         -0.690684
2*z+3*x*x    0.764729
w            0.641750
target       1.000000
Name: target, dtype: float64

corr = X.corr()
ax = sns.heatmap(corr,vmin=-1, vmax=1, center=0,cmap=sns.diverging_palette(20, 220, n=200),square=True)
ax.set_xticklabels(ax.get_xticklabels(),rotation=45,horizontalalignment='right');

Output:

AttributeError                            Traceback (most recent call last)
<ipython-input-42-749cdea8ad1a> in <module>
      1 ##correlation matrix using seaborn heatmap##https://towardsdatascience.com/better-heatmapscorr = X.corr()
----> 2 corr = X.corr()
      3 ax = sns.heatmap(corr,vmin=-1, vmax=1, center=0,cmap=sns.diverging_palette(20, 220, n=200),square=True)
      4 ax.set_xticklabels(ax.get_xticklabels(),rotation=45,horizontalalignment='right');

AttributeError: 'numpy.ndarray' object has no attribute 'corr'

How can I fix this?

Why did you use .values when creating X? That returns a NumPy array, and a NumPy array has no .corr() method.

If you remove .values, X will remain a pandas DataFrame, which has the .corr() method. Then your code will run as you intended.

corr = X.corr()
ax = sns.heatmap(corr, vmin=-1, vmax=1, center=0, cmap=sns.diverging_palette(20, 220, n=200), square=True)
ax.set_xticklabels(ax.get_xticklabels(), rotation=45, horizontalalignment='right');
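
A minimal sketch of this fix, assuming a small synthetic DataFrame as a stand-in for task_d.csv (the column values below are invented for illustration, and the name X_df is hypothetical). The idea is to keep the features as a DataFrame for .corr() and convert to NumPy only where an array is actually needed:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for task_d.csv; the values are invented for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
data = pd.DataFrame({
    "x": x,
    "2*x": 2 * x,                    # perfectly collinear with x
    "w": rng.normal(size=100),
    "target": (x > 0).astype(int),
})

# Keep the features as a DataFrame so that .corr() is available.
X_df = data.drop(columns=["target"])
corr = X_df.corr()

# Convert to NumPy arrays only where they are required, e.g. for model fitting.
X = X_df.to_numpy()
Y = data["target"].to_numpy()

print(type(X_df).__name__, type(X).__name__)
print(corr.round(2))
```

The resulting corr can be passed to sns.heatmap exactly as in the question. Here to_numpy() is used in place of .values; both return a NumPy array.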

Instead of X, call .corr() directly on the dataset. This will help:

corr = data.corr()
ax = sns.heatmap(corr, vmin=-1, vmax=1, center=0, cmap=sns.diverging_palette(20, 220, n=200), square=True)
ax.set_xticklabels(ax.get_xticklabels(), rotation=45, horizontalalignment='right');
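
Note that data.corr() also includes the target column in the matrix. If only feature-to-feature correlations are of interest, one option is to drop target first and then list the strongly correlated pairs. The sketch below uses synthetic data in place of task_d.csv, and the 0.95 cutoff and the names features_corr and high_pairs are assumptions chosen for illustration, not part of the original question:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for task_d.csv; the values are invented for illustration.
rng = np.random.default_rng(1)
y = rng.normal(size=200)
z = rng.normal(size=200)
data = pd.DataFrame({
    "y": y,
    "2*y": 2 * y,                    # perfectly collinear with y
    "z": z,
    "target": (z > 0).astype(int),
})

# Correlation matrix over the features only (target excluded).
features_corr = data.drop(columns=["target"]).corr()

# Collect each unordered pair of distinct features whose absolute
# correlation exceeds the (assumed) threshold.
threshold = 0.95
cols = list(features_corr.columns)
high_pairs = [
    (a, b, features_corr.loc[a, b])
    for i, a in enumerate(cols)
    for b in cols[i + 1:]
    if abs(features_corr.loc[a, b]) > threshold
]

for a, b, r in high_pairs:
    print(f"{a} vs {b}: r = {r:.3f}")
```

Applied to the real dataset, a check like this would be expected to flag pairs such as y and 2*y, which have identical values in the rows shown in the question.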

#python #machinelearning
