How to use sklearn preprocessing fit_transform with pandas.groupby.transform

Question

How can I use sklearn preprocessing fit_transform() with pandas.groupby.transform?

I used this code, which works:

[Image: sample dataframe]

df.groupby('Category')['X1'].transform(lambda x: minmax_scale(x.astype(float)))
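The sample dataframe itself is not shown, so here is a minimal sketch of the working call on made-up data. The values below are assumptions chosen for illustration; only the column names Category and X1 come from the question.

```python
import pandas as pd
from sklearn.preprocessing import minmax_scale

# Hypothetical data standing in for the dataframe in the picture.
df = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B", "B"],
    "X1": [1, 2, 3, 10, 20, 40],
})

# minmax_scale accepts a 1-D input and returns a 1-D array of the same
# length, which is the shape groupby.transform expects from the function.
scaled = df.groupby("Category")["X1"].transform(
    lambda x: minmax_scale(x.astype(float))
)
print(scaled.tolist())
```

Each category is scaled independently, so every group gets its own minimum mapped to 0 and its own maximum mapped to 1.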

But when I changed it to use MinMaxScaler() as shown below, it raises an error.

Code that raises the error when using the .fit_transform method:

Assume the table has only 2 columns: Category and X1.

df.groupby('Category')['X1'].transform(lambda x: MinMaxScaler().fit_transform(x.values.reshape(-1,1)))

Error message:

Data must be 1-dimensional

However, if I don't use .values.reshape(-1,1), it says:

Expected 2D array, got 1D array instead. Reshape your data either using array.reshape(-1, 1) if your data has a single feature

Are we not supposed to use the fit_transform method with .apply / .transform in pandas?

Edit: updated with the new error message.

Answer 1

You have to use a MinMaxScaler object instance (add parentheses). Try this:

lambda x: MinMaxScaler().fit_transform(x.values.reshape(-1,1))

If you want to set the scaling range, pass it to the constructor:

lambda x: MinMaxScaler(feature_range=(0, 10)).fit_transform(x.values.reshape(-1,1))

Here is a working example:

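import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# 10 random integers in [1, 100) in a single column 'a'
df = pd.DataFrame(np.random.randint(1, 100, 10), columns=['a'])
df['a'].transform(lambda x: MinMaxScaler(feature_range=(0, 10))
                  .fit_transform(x.values.reshape(-1, 1)))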

array([[ 0.        ],
       [ 6.55172414],
       [ 9.88505747],
       [ 6.09195402],
       [ 1.26436782],
       [ 8.62068966],
       [ 6.43678161],
       [ 5.74712644],
       [ 5.17241379],
       [10.        ]])
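Note that the output above is a 2-D column of shape (n, 1), while the question's error says the groupby transform wants 1-D data. The following is a minimal sketch of that shape difference, using made-up numbers and ravel() as one possible way to flatten the result; it is an illustration, not part of the original answer.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

values = np.array([3.0, 7.0, 11.0])  # made-up numbers for illustration

# The scaler needs a 2-D (n_samples, n_features) input...
column = MinMaxScaler(feature_range=(0, 10)).fit_transform(values.reshape(-1, 1))
print(column.shape)  # (3, 1)

# ...and returns a 2-D array; ravel() flattens it back to one dimension.
flat = column.ravel()
print(flat.shape)  # (3,)
```

The flattened array has one value per input row, which is the form a per-group transform function can hand back to pandas.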

Answer 2

I just found the solution, which is to wrap the scaler output in np.concatenate(). This joins the rows of the (n, 1) array returned by fit_transform into a 1-dimensional array, which is what groupby.transform expects. The solution is similar to the one in this thread: "Pandas groupby in combination with sklean preprocessing continued".

So the working code looks like this:

df.groupby('Category')['X1'].transform(
    lambda x: np.concatenate(StandardScaler().fit_transform(x.values.reshape(-1, 1))))
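The flattening approach can be checked end to end with a small self-contained sketch. The data, the helper name scale_group, and the X1_scaled column are hypothetical, and ravel() is used here in place of np.concatenate() as an equivalent way to get a 1-D result.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B", "B"],
    "X1": [1, 2, 3, 10, 20, 40],
})

def scale_group(x):
    """Standardize one group's values and return them as a 1-D array."""
    # Reshape to (n, 1) for the scaler, then flatten the (n, 1) result.
    scaled = StandardScaler().fit_transform(
        x.to_numpy(dtype=float).reshape(-1, 1)
    )
    return scaled.ravel()

df["X1_scaled"] = df.groupby("Category")["X1"].transform(scale_group)

# Each group should now have mean 0 and population standard deviation 1.
group_means = df.groupby("Category")["X1_scaled"].mean()
group_stds = df.groupby("Category")["X1_scaled"].std(ddof=0)
print(group_means)
print(group_stds)
```

Creating a new scaler inside the function means each group is fitted on its own values only, which matches the per-group lambda in the answer.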

Answer 1
0 2020-01-26 08:02:26

Answer 2
0 2020-01-26 21:50:45
