scikit-learn MinMaxScaler produces slightly different results than a NumPy implementation
I compared the scikit-learn Min-Max scaler from its preprocessing module with a "manual" approach using NumPy. However, I noticed that the results are slightly different. Does anyone have an explanation for this?
Using the following equation for Min-Max scaling:
X_norm = (X - X_min) / (X_max - X_min)
which should be the same as the scikit-learn one:
(X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
I am using both approaches as follows:
from sklearn import preprocessing

def numpy_minmax(X):
    xmin = X.min()
    return (X - xmin) / (X.max() - xmin)

def sci_minmax(X):
    minmax_scale = preprocessing.MinMaxScaler(feature_range=(0, 1), copy=True)
    return minmax_scale.fit_transform(X)
On a random sample:
import numpy as np
np.random.seed(123)
# A random 2D-array ranging from 0-100
X = np.random.rand(100,2)
X.dtype = np.float64
X *= 100
The results are slightly different:
from matplotlib import pyplot as plt

sci_mm = sci_minmax(X)
numpy_mm = numpy_minmax(X)

plt.scatter(numpy_mm[:,0], numpy_mm[:,1],
        color='g',
        label='NumPy bottom-up',
        alpha=0.5,
        marker='o'
        )
plt.scatter(sci_mm[:,0], sci_mm[:,1],
        color='b',
        label='scikit-learn',
        alpha=0.5,
        marker='x'
        )
plt.legend()
plt.grid()
plt.show()
scikit-learn processes each feature (column) individually. So you need to specify axis=0 when taking the min; otherwise numpy.min returns the minimum over all elements of the array rather than the minimum of each column separately:
>>> xs
array([[1, 2],
       [3, 4]])
>>> xs.min()
1
>>> xs.min(axis=0)
array([1, 2])
The same applies to numpy.max, so the correct function would be:
def numpy_minmax(X):
    xmin = X.min(axis=0)
    return (X - xmin) / (X.max(axis=0) - xmin)
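To see what the axis argument changes in practice, here is a minimal sketch on made-up toy data. The array and the names minmax_global and minmax_per_column are purely illustrative and are not part of scikit-learn or of the original post. It prints the per-column minimum and maximum after scaling with each variant.

```python
import numpy as np

def minmax_global(X):
    # One min and one max for the whole array (the version from the question).
    xmin = X.min()
    return (X - xmin) / (X.max() - xmin)

def minmax_per_column(X):
    # A separate min and max for every column (feature).
    xmin = X.min(axis=0)
    return (X - xmin) / (X.max(axis=0) - xmin)

# Hypothetical toy data: two features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])

global_scaled = minmax_global(X)
column_scaled = minmax_per_column(X)

# Per-column range of each result.
print("global:    ", global_scaled.min(axis=0), global_scaled.max(axis=0))
print("per-column:", column_scaled.min(axis=0), column_scaled.max(axis=0))
```

With the global variant, only the column that happens to contain the overall extremes reaches 0 or 1, and the small-scale column is squeezed near 0. With the per-column variant, every column spans the full range from 0 to 1.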
Doing so, you will get an exact match.
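Rather than relying on a plot, the match can also be checked numerically. The following is a sketch, assuming scikit-learn is installed and using a sample similar to the one in the question. It compares with np.allclose instead of strict equality, since the two code paths may round differently in the last few bits.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def numpy_minmax(X):
    # Column-wise Min-Max scaling, as in the corrected function above.
    xmin = X.min(axis=0)
    return (X - xmin) / (X.max(axis=0) - xmin)

# A random 2D array with values between 0 and 100 (seeded for repeatability).
rng = np.random.RandomState(123)
X = rng.rand(100, 2) * 100

sci_mm = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)
numpy_mm = numpy_minmax(X)

# Compare within floating-point tolerance rather than bit for bit.
match = np.allclose(sci_mm, numpy_mm)
print(match)
```

If match comes out False on your own data, a likely first thing to check is whether min and max were taken with axis=0.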