Scikit-learn Imputer Reducing Dimensions
I have a dataframe with 332 columns. I want to impute the missing values so that I can use scikit-learn's decision tree classifier.
My problem is that the data returned by the imputer has only 330 columns.
from sklearn.preprocessing import Imputer
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
cols = data.columns
new = imp.fit_transform(data)
print(data.shape,new.shape)
(34132, 332) (34132, 330)
According to the documentation of sklearn.preprocessing.Imputer:
When axis=0, columns which only contained missing values at fit are discarded upon transform.
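So the imputer is dropping the columns that contain only missing values. A column with no observed values has no mean to fill in, which accounts for the two columns missing from the result (332 in, 330 out).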
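To see which columns are affected and keep the column names lined up, one option is to find the all-missing columns yourself before imputing. The following is a minimal sketch, not the original poster's code: the three-column frame is a made-up stand-in for the 332-column data, and it uses sklearn.impute.SimpleImputer, which replaced Imputer in newer scikit-learn releases (Imputer was removed in 0.22). SimpleImputer always works column-wise, so there is no axis argument.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy stand-in for the real data (assumption): column "b" is entirely missing.
data = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],
    "c": [4.0, 5.0, np.nan],
})

# Columns with no observed values are the ones the imputer would drop.
all_missing = data.columns[data.isna().all()]
kept = data.columns[data.notna().any()]

# Impute only the columns that have at least one observed value,
# then rebuild a DataFrame so the column names are preserved.
imp = SimpleImputer(missing_values=np.nan, strategy="mean")
new = pd.DataFrame(imp.fit_transform(data[kept]), columns=kept, index=data.index)

print(list(all_missing))
print(data.shape, new.shape)
```

Whether to drop those columns or fill them with a constant instead depends on the data; the sketch only makes the dropped columns explicit so the shape change is no longer a surprise.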