
Scikit-learn Imputer Reducing Dimensions

I have a dataframe with 332 columns. I want to impute the missing values so that I can use scikit-learn's decision tree classifier. My problem is that the data returned by the imputer has only 330 columns.

from sklearn.preprocessing import Imputer
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
cols = data.columns
new = imp.fit_transform(data)

print(data.shape,new.shape)
(34132, 332) (34132, 330)

According to the documentation of sklearn.preprocessing.Imputer:

When axis=0, columns which only contained missing values at fit are discarded upon transform.

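So the imputer is dropping the columns that contain only missing values. A column with no observed values has no mean to fill in, which is consistent with two of your 332 columns being entirely NaN.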
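To see which columns are affected, you can check for all-NaN columns before imputing. The following is a minimal sketch in pandas, not the imputer's own implementation: the toy frame and the names all_nan_cols, kept_cols and imputed are made up for illustration, and the mean fill is done with fillna so that the column labels survive.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the 332-column frame (assumed data):
# column "b" is entirely NaN.
data = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],
    "c": [4.0, 5.0, np.nan],
})

# Columns in which every value is missing. A column-wise mean
# imputer has no mean to fill these with, so it drops them.
all_nan_cols = data.columns[data.isna().all()].tolist()

# Columns with at least one observed value, in their original order.
kept_cols = data.columns[data.notna().any()].tolist()

# Column-wise mean imputation on the surviving columns, done in
# pandas so the result keeps its column labels.
imputed = data[kept_cols].fillna(data[kept_cols].mean())

print(all_nan_cols)   # ['b']
print(imputed.shape)  # (3, 2)
```

On the real data, kept_cols should list the 330 columns that survive, in order, so it can be used to relabel the array returned by fit_transform.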
