
Numpy array scaling not returning proper values

I have a numpy array that I want to alter by scaling all of the columns (e.g. all the values in a column are divided by the maximum value in that column, so that no value exceeds 1).

A sample output of the array is:

[ 2. 0. 367.877 ..., -0.358 51.547 -32.633]
[ 2. 0. 339.824 ..., -0.33 52.562 -27.581]
[ 3. 0. 371.438 ..., -0.406 55.108 -35.573]

I've tried scaling the array (data_in) with the following code:

# normalize the data_in array
data_in_normalized = data_in / data_in.max(axis=0)

However, the output of data_in_normalized is:

[ 0.5 0. 0.95437199 0.89363654 0.80751792 ]
[ 0.46931238 0.50660904 0.5003812 0.91250444 0.625 ]
[ 0.96229214 0.89483109 0.86989432 0.86491407 0.71287646 ]
[ -23.90909091 0.34346373 1.25110652 0. 0.8537859 1. 1.]

Clearly, the data was not normalized: there are several places where a value is greater than 1. Is there a better way to scale the data, or am I using the max() function incorrectly (e.g. is the max() value being shared between columns)?

IIUC, it's not that the maximum value is shared between columns; rather, you probably want to divide by the maximum absolute value instead, because you have elements of both signs. After all, 1 > -100, so if you divide a column containing [1, -100] by its maximum value, nothing changes.

For example:

>>> import numpy as np
>>> data_in = np.array([[-3,-2],[2,1]])
>>> data_in
array([[-3, -2],
       [ 2,  1]])
>>> data_in.max(axis=0)
array([2, 1])
>>> data_in / data_in.max(axis=0)
array([[-1.5, -2. ],
       [ 1. ,  1. ]])

but

>>> data_in / np.abs(data_in).max(axis=0)
array([[-1.        , -1.        ],
       [ 0.66666667,  0.5       ]])
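
The same idea can be wrapped in a small helper. This is only a minimal sketch: the function name scale_columns_by_max_abs is hypothetical, and leaving all-zero columns unchanged (rather than dividing by zero) is an assumption that the question does not address.

```python
import numpy as np


def scale_columns_by_max_abs(data):
    """Divide each column by its largest absolute value (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # Per-column maximum of the absolute values, shape (n_columns,)
    col_scale = np.abs(data).max(axis=0)
    # Assumption: leave all-zero columns as they are instead of dividing by zero
    col_scale[col_scale == 0] = 1.0
    # Broadcasting divides every row by the per-column scale
    return data / col_scale


data_in = np.array([[-3.0, -2.0, 0.0],
                    [2.0, 1.0, 0.0]])
scaled = scale_columns_by_max_abs(data_in)
print(scaled)
# Largest magnitude per column after scaling
print(np.abs(scaled).max(axis=0))
```

After this scaling every entry lies in [-1, 1], and each column that is not all zeros contains at least one entry with magnitude exactly 1.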
