Different array sizes after fit_transform

Question

I have a problem with the fit_transform function. Can someone explain why the sizes of the resulting arrays are different?

In [5]: X.shape, test.shape

Out[5]: ((1000, 1932), (1000, 1932))

In [6]: from sklearn.feature_selection import VarianceThreshold
        sel = VarianceThreshold(threshold=(.8 * (1 - .8)))
        features = sel.fit_transform(X)
        features_test = sel.fit_transform(test)

In [7]: features.shape, features_test.shape

Out[7]: ((1000, 1663), (1000, 1665))

UPD: Which transformation can help me get arrays of the same size?

Answer 1

It is because you are fitting your selector twice.

First, note that fit_transform is just a call to fit followed by a call to transform.
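To make that equivalence concrete, here is a small sketch. The toy array X_demo and the threshold of 0.1 are made up for illustration and are not taken from the question.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Hypothetical toy data: column 0 is constant, columns 1 and 2 vary.
X_demo = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 1],
                   [0, 0, 0]])

# One step: fit and transform in a single call.
one_step = VarianceThreshold(threshold=0.1).fit_transform(X_demo)

# Two steps: fit first, then transform the same data.
sel_demo = VarianceThreshold(threshold=0.1)
sel_demo.fit(X_demo)
two_step = sel_demo.transform(X_demo)

# Both routes drop the constant column and return the same array.
same_result = np.array_equal(one_step, two_step)
print(one_step.shape, same_result)  # (4, 2) True
```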

The fit method allows your VarianceThreshold selector to find the features it wants to keep in the dataset, based on the parameters you gave it.

The transform method performs the actual feature selection and returns an array with just the selected features.
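Following that reasoning, one way to get arrays with the same number of columns is to fit the selector once on X and then only call transform on test, so that both arrays keep the columns chosen from X. A minimal sketch, assuming random binary data as a stand-in for the X and test arrays from the question (rng, p and kept are illustrative names, and the 50-column shape is arbitrary):

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

rng = np.random.RandomState(0)

# Hypothetical stand-ins for X and test: binary features where each
# column has its own probability of being 1.
p = rng.uniform(0.0, 1.0, size=50)
X = (rng.uniform(size=(1000, 50)) < p).astype(int)
test = (rng.uniform(size=(1000, 50)) < p).astype(int)

sel = VarianceThreshold(threshold=(.8 * (1 - .8)))
features = sel.fit_transform(X)      # fit on X only, then select
features_test = sel.transform(test)  # reuse the columns chosen from X

kept = sel.get_support()             # boolean mask of the kept columns
print(features.shape, features_test.shape, kept.sum())
```

Because the selector is fitted only once, features and features_test always have the same number of columns, whatever the variances in test happen to be.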

Answer 2

Because fit_transform applies a dimensionality reduction to the array. This is why the dimensions of the resulting arrays are not the same as those of the input.

See the question "What is the difference between 'transform' and 'fit_transform' in sklearn" and http://scikit-learn.org/stable/modules/feature_extraction.html

Answer 1: 6 votes, accepted, 2015-08-31 13:52:55

Answer 2: 0 votes, 2015-08-31 12:56:23