How to normalize a 2D array with sklearn?
Given a 2D array, I would like to normalize it into the range 0-1.
I know this can be achieved as below:
import numpy as np
from sklearn.preprocessing import normalize,MinMaxScaler
np.random.seed(0)
t_feat=4
t_epoch=3
t_wind=2
result = [np.random.rand(t_epoch, t_feat) for _ in range(t_wind)]
wdw_epoch_feat=np.array(result)
matrix=wdw_epoch_feat[:,:,0]
xmax, xmin = matrix.max(), matrix.min()
x_norm = (matrix - xmin)/(xmax - xmin)
which produces
[[0.55153917 0.42094786 0.98439526], [0.57160496 0. 1. ]]
However, I cannot get the same result using sklearn's MinMaxScaler:
scaler = MinMaxScaler()
x_norm = scaler.fit_transform(matrix)
which produces
[[0. 1. 0.], [1. 0. 1.]]
I would appreciate any thoughts.
Your manual computation normalizes over the entire matrix, using one global minimum and maximum. MinMaxScaler is designed for machine learning, so it scales each column (feature) independently. Your matrix has only two rows, so in every column the smaller value becomes 0 and the larger becomes 1. To get the same results as your manual computation, you need to flatten the 2D array into a single column. I show this below, with a second column of random values added for illustration, and get your results in the first column:
import numpy as np
from sklearn.preprocessing import normalize, MinMaxScaler
np.random.seed(0)
t_feat=4
t_epoch=3
t_wind=2
result = [np.random.rand(t_epoch, t_feat) for _ in range(t_wind)]
wdw_epoch_feat=np.array(result)
matrix=wdw_epoch_feat[:,:,0]
xmax, xmin = matrix.max(), matrix.min()
x_norm = (matrix - xmin)/(xmax - xmin)
matrix = np.array([matrix.flatten(), np.random.rand(len(matrix.flatten()))]).T
scaler = MinMaxScaler()
test = scaler.fit_transform(matrix)
print(test)
-------------------------------------------
[[0.55153917 0. ]
[0.42094786 0.63123194]
[0.98439526 0.03034732]
[0.57160496 1. ]
[0. 0.48835502]
[1. 0.35865137]]
When you use MinMaxScaler for machine learning, you generally want to scale each column (feature) separately.
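To make the per-column behavior concrete, here is a minimal sketch that compares MinMaxScaler's default output on the question's matrix with a manual NumPy computation using per-column minima and maxima. The names `col_min`, `col_max`, `manual_per_column` and `sklearn_per_column` are introduced only for this illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Rebuild the 2x3 matrix from the question.
np.random.seed(0)
wdw_epoch_feat = np.array([np.random.rand(3, 4) for _ in range(2)])
matrix = wdw_epoch_feat[:, :, 0]

# Manual per-column scaling: min and max are taken along axis 0,
# so each column is normalized on its own.
col_min = matrix.min(axis=0)
col_max = matrix.max(axis=0)
manual_per_column = (matrix - col_min) / (col_max - col_min)

# MinMaxScaler with default settings (feature_range=(0, 1)).
sklearn_per_column = MinMaxScaler().fit_transform(matrix)

# Compare the two results element-wise.
print(np.allclose(manual_per_column, sklearn_per_column))
```

With only two rows, each column holds exactly one minimum and one maximum, which is why the question's MinMaxScaler output contains nothing but 0s and 1s.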
A clever way to do this would be to reshape your data into a single column, apply the transform, and reshape it back to the original shape:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
X = np.array([[-1, 2], [-0.5, 6]])
scaler = MinMaxScaler()
X_one_column = X.reshape([-1,1])
result_one_column = scaler.fit_transform(X_one_column)
result = result_one_column.reshape(X.shape)
print(result)
[[ 0. 0.42857143]
[ 0.07142857 1. ]]
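The reshape approach can be wrapped in a small helper and checked against the plain NumPy formula from the question. This is only a sketch: `minmax_scale_global` is a hypothetical name chosen for this example, not an sklearn function.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def minmax_scale_global(X):
    """Scale all entries of X to [0, 1] using one global min and max.

    Hypothetical helper for illustration; not part of sklearn.
    """
    X = np.asarray(X, dtype=float)
    # Treat every entry as a sample of a single feature (one column).
    flat = X.reshape(-1, 1)
    scaled = MinMaxScaler().fit_transform(flat)
    # Restore the original shape.
    return scaled.reshape(X.shape)

X = np.array([[-1, 2], [-0.5, 6]])
via_sklearn = minmax_scale_global(X)

# The same normalization written directly in NumPy.
via_numpy = (X - X.min()) / (X.max() - X.min())

print(via_sklearn)
print(np.allclose(via_sklearn, via_numpy))
```

If the scaler is not going to be reused on new data, the direct NumPy expression is shorter; going through MinMaxScaler mainly helps when you want to keep the fitted scaler and later call `transform` or `inverse_transform` with the same global minimum and maximum.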