Normalise 2D Numpy Array: Zero Mean Unit Variance

Question

I have a 2D Numpy array in which I want to normalise each column to zero mean and unit variance. Since I'm primarily used to C++, my current approach is to loop over the elements of a column, do the necessary operations, and then repeat this for every column. I wanted to know about a Pythonic way to do so.

Let class_input_data be my 2D array. I can get the column mean as:

column_mean = numpy.sum(class_input_data, axis = 0)/class_input_data.shape[0]

I then subtract the mean from all columns by:

class_input_data = class_input_data - column_mean

By now, the data should be zero mean. However, the value of:

numpy.sum(class_input_data, axis = 0)

isn't equal to 0, implying that I have done something wrong in my normalisation. By "isn't equal to 0", I don't mean very small numbers that can be attributed to floating-point inaccuracies.
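
The steps described above can be reproduced in a small sketch to check whether the residual column sums are larger than rounding error. The array below is made-up sample data standing in for class_input_data, and the tolerance is an assumption. The cast to float64 is only a precaution in case the original array has an integer or low-precision dtype, which the question does not state.

```python
import numpy as np

# Made-up sample data standing in for class_input_data (assumption).
rng = np.random.default_rng(0)
class_input_data = rng.normal(loc=5.0, scale=2.0, size=(100, 4))

# Work in float64 so the arithmetic is not affected by an integer or float32 dtype.
data = np.asarray(class_input_data, dtype=np.float64)

column_mean = data.mean(axis=0)   # shape (4,), one mean per column
centered = data - column_mean     # broadcasting subtracts each column's mean

column_sums = centered.sum(axis=0)

# Residual sums should be tiny compared with the magnitude of the data.
scale = np.abs(data).sum(axis=0)
is_zero_mean = np.allclose(column_sums, 0.0, atol=1e-9 * scale.max())
print(column_sums, is_zero_mean)
```

If is_zero_mean comes out False on the real data, the dtype and shape of the array and of column_mean are probably the first things worth inspecting.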

Answer 1

Something like:

import numpy as np

eg_array = 5 + (np.random.randn(10, 10) * 2)
normed = (eg_array - eg_array.mean(axis=0)) / eg_array.std(axis=0)

normed.mean(axis=0)
Out[14]: 
array([  1.16573418e-16,  -7.77156117e-17,  -1.77635684e-16,
         9.43689571e-17,  -2.22044605e-17,  -6.09234885e-16,
        -2.22044605e-16,  -4.44089210e-17,  -7.10542736e-16,
         4.21884749e-16])

normed.std(axis=0)
Out[15]: array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])
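
The same expression can be wrapped in a small helper, sketched here under a few assumptions: the name standardize_columns is hypothetical, the input is cast to float64, and constant columns (zero standard deviation) are left at zero after centering rather than being divided by zero. The ddof argument is passed through to numpy's std, so ddof=1 would use the sample standard deviation instead of the population one.

```python
import numpy as np

def standardize_columns(x, ddof=0):
    """Return a copy of x with each column scaled to zero mean and unit variance."""
    x = np.asarray(x, dtype=np.float64)
    mean = x.mean(axis=0)
    std = x.std(axis=0, ddof=ddof)
    # Guard against division by zero: constant columns keep a divisor of 1,
    # so they end up as all zeros after centering.
    safe_std = np.where(std == 0, 1.0, std)
    return (x - mean) / safe_std

# Small illustrative input; the third column is constant on purpose.
example = np.array([[1.0, 10.0, 3.0],
                    [2.0, 20.0, 3.0],
                    [3.0, 30.0, 3.0]])
normed = standardize_columns(example)
print(normed.mean(axis=0))
print(normed.std(axis=0))
```

No explicit loop over columns is needed, because axis=0 reduces over rows and broadcasting applies the per-column mean and standard deviation to every row.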

Answer 1 (15, accepted, 2015-07-01 05:17:53)
