Scikit-learn Imputer reducing dimensions

Question

I have a dataframe with 332 columns. I want to impute the missing values so that I can use scikit-learn's decision tree classifier. My problem is that the data returned by the imputer has only 330 columns.

from sklearn.preprocessing import Imputer
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
cols = data.columns
new = imp.fit_transform(data)

print(data.shape,new.shape)
(34132, 332) (34132, 330)

Answer 1

According to the documentation of sklearn.preprocessing.Imputer:

When axis=0, columns which only contained missing values at fit are discarded upon transform.

So the imputer is dropping the columns in which every value is missing. In your case that is 2 of the 332 columns, since no mean can be computed for them.
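To see which columns are affected, you can check for all-missing columns with pandas before imputing. This is only a minimal sketch: the small frame below is a made-up stand-in for `data`, and the names `all_missing` and `kept_cols` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for `data`; column "b" is entirely missing.
data = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],
    "c": [4.0, 5.0, np.nan],
})

# Columns in which every value is missing. A mean imputer has nothing
# to compute a mean from here, which is why such columns get discarded.
all_missing = data.columns[data.isna().all()].tolist()

# Columns that still have at least one observed value, in original order.
kept_cols = [c for c in data.columns if c not in all_missing]

print(all_missing)
print(len(data.columns), len(kept_cols))
```

Since the imputer keeps the remaining columns in their original order, `kept_cols` should line up with the columns of the imputed array, so something like `pd.DataFrame(new, columns=kept_cols)` can restore the labels. If you need to keep the all-missing columns, you would have to fill them with some constant yourself before imputing.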

Solution 1
3 2016-08-11 19:03:57
