How to normalize a 2D array with sklearn?

Question

Given a 2D array, I would like to normalize it into the range 0-1.

I know this can be achieved as follows:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

np.random.seed(0)
t_feat = 4
t_epoch = 3
t_wind = 2

# Build a (t_wind, t_epoch, t_feat) array and keep feature 0, giving a 2x3 matrix
result = [np.random.rand(t_epoch, t_feat) for _ in range(t_wind)]
wdw_epoch_feat = np.array(result)
matrix = wdw_epoch_feat[:, :, 0]

# Min-max scaling with the global minimum and maximum of the matrix
xmax, xmin = matrix.max(), matrix.min()
x_norm = (matrix - xmin) / (xmax - xmin)

which produces

[[0.55153917 0.42094786 0.98439526], [0.57160496 0.         1.        ]]

However, I cannot get the same result using sklearn's MinMaxScaler:

scaler = MinMaxScaler()
x_norm = scaler.fit_transform(matrix)

which produces

[[0. 1. 0.], [1. 0. 1.]]

I would appreciate any thoughts.

Answer 1

Your manual code scales the entire matrix using its global minimum and maximum. MinMaxScaler is designed for machine learning, so it scales each feature (column) independently, using that column's own minimum and maximum. To get the same results as your manual code, you need to turn the 2D array into a 1D array and pass it to the scaler as a single column. I show this below and get your results in the first column:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

np.random.seed(0)
t_feat = 4
t_epoch = 3
t_wind = 2

result = [np.random.rand(t_epoch, t_feat) for _ in range(t_wind)]
wdw_epoch_feat = np.array(result)
matrix = wdw_epoch_feat[:, :, 0]

# Manual scaling with the global min and max, kept for comparison
xmax, xmin = matrix.max(), matrix.min()
x_norm = (matrix - xmin) / (xmax - xmin)

# Column 0 is the flattened matrix; column 1 is random data that gets scaled separately
matrix = np.array([matrix.flatten(), np.random.rand(len(matrix.flatten()))]).T
scaler = MinMaxScaler()
test = scaler.fit_transform(matrix)

print(test)
-------------------------------------------
[[0.55153917 0.        ]
 [0.42094786 0.63123194]
 [0.98439526 0.03034732]
 [0.57160496 1.        ]
 [0.         0.48835502]
 [1.         0.35865137]]

When you use MinMaxScaler for machine learning, you generally want to scale each column separately.
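
To make the per-column behavior concrete, here is a minimal sketch that compares MinMaxScaler with the same formula written out per column in NumPy. The array `X` is made up for illustration and is not the data from the question.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical 2x3 array, chosen only for illustration.
X = np.array([[1.0, 8.0, 3.0],
              [5.0, 2.0, 9.0]])

# MinMaxScaler scales every column independently (default feature_range is (0, 1)).
scaled = MinMaxScaler().fit_transform(X)

# The same computation written out with per-column min and max.
col_min = X.min(axis=0)
col_max = X.max(axis=0)
manual = (X - col_min) / (col_max - col_min)

print(scaled)
print(np.allclose(scaled, manual))
```

With only two rows, each column holds exactly its own minimum and its own maximum, so every entry is mapped to 0 or 1. That is the same pattern as the output in the question.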

Answer 2

A clever way to do this is to reshape your data into a single column, apply the transform, and reshape the result back to the original shape:

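import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[-1, 2], [-0.5, 6]])
scaler = MinMaxScaler()
# A single column makes the scaler treat all entries as one feature
X_one_column = X.reshape([-1, 1])
result_one_column = scaler.fit_transform(X_one_column)
# Restore the original 2D shape
result = result_one_column.reshape(X.shape)
print(result)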

[[ 0.          0.42857143]
 [ 0.07142857  1.        ]]
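
If you need this in more than one place, the reshape trick can be wrapped in a small helper. The sketch below is one way to do that; the function name `minmax_scale_global` is made up here and is not part of sklearn. It also checks the result against the plain NumPy formula from the question.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def minmax_scale_global(arr):
    """Scale all entries of arr to [0, 1] using the global min and max.

    Hypothetical helper name; it only wraps the reshape trick shown above.
    """
    arr = np.asarray(arr, dtype=float)
    # A single column makes the scaler see every entry as one feature.
    column = arr.reshape(-1, 1)
    scaled = MinMaxScaler().fit_transform(column)
    # Restore the original shape.
    return scaled.reshape(arr.shape)


X = np.array([[-1, 2], [-0.5, 6]])
result = minmax_scale_global(X)

# Direct NumPy version of the same formula, for comparison.
expected = (X - X.min()) / (X.max() - X.min())
print(np.allclose(result, expected))
```

If you do not need a fitted scaler object for later reuse, the direct NumPy formula is the shorter route and gives the same values.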

Answer 1
1 accepted 2021-01-30 04:26:06

Answer 2
1 2021-01-30 04:30:47
