
Deducting the median from each column

I have a dataframe, df, with numbers, like so:

1 1 1
2 1 1
2 1 3

I'd like to deduct the median from each column so that the median of each becomes 0.

-1 0 0
0 0 0
0 0 2

How do I do this in a pythonic way? I'm guessing it is possible without iterating over the values, computing the median and then deducting. I'd like to do it tersely, approximately like so:

from numpy import median
df -= median(df) #does not work, deducts median for whole dataframe

Just like this:

df -= df.median(axis=0)
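As a quick check against the data in the question, here is a minimal sketch. The column names a, b, c are made up for illustration, since the question does not give any.

```python
import pandas as pd

# The example data from the question; column names are hypothetical.
df = pd.DataFrame([[1, 1, 1],
                   [2, 1, 1],
                   [2, 1, 3]], columns=["a", "b", "c"])

# median(axis=0) returns one value per column, as a Series indexed by column name.
medians = df.median(axis=0)

# Subtracting a Series from a DataFrame aligns on column labels,
# so each column has its own median removed.
centered = df - medians
print(centered)
```

Every column of centered should then have a median of 0, which is the result asked for.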

By default, numpy's median computes the median of the whole data set. To do this with numpy, try this code instead:

df -= median(df, axis=0)

For more detail, see the documentation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.median.html
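To make the difference concrete, the sketch below compares the two calls on the numbers from the question, held in a plain numpy array (the variable names are only illustrative).

```python
import numpy as np

# The example data from the question as a plain array.
data = np.array([[1, 1, 1],
                 [2, 1, 1],
                 [2, 1, 3]])

# Without axis, the array is flattened and a single median is returned.
overall = np.median(data)

# With axis=0, one median is computed per column.
per_column = np.median(data, axis=0)

print(overall)
print(per_column)
```

Subtracting overall shifts every value by the same amount, which is why the attempt in the question did not zero each column's median.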

Some testing in ipython showed:

In [23]: A = numpy.arange(9)

In [24]: B = A.reshape((3,3))

In [25]: C = numpy.median(B,axis=0)

In [26]: D = B - C[None,:]

In [27]: B
Out[27]: 
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [28]: D
Out[28]: 
array([[-3., -3., -3.],
       [ 0.,  0.,  0.],
       [ 3.,  3.,  3.]])
In [29]: C
Out[29]: array([ 3.,  4.,  5.])

So the next line computes the median of each column:

C = numpy.median(B,axis=0)

And the next line subtracts it from the matrix, column by column:

D = B - C[None,:]
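A small sketch of what the C[None,:] indexing does, reusing the names B and C from the session above: it turns the length-3 vector of medians into a 1x3 row, which numpy then broadcasts across every row of B. For a 2-D B, numpy's broadcasting rules give the same result with plain B - C, so the explicit None mainly documents the intent.

```python
import numpy as np

B = np.arange(9).reshape((3, 3))
C = np.median(B, axis=0)          # shape (3,): one median per column

row = C[None, :]                  # shape (1, 3): a single row of medians
D = B - row                       # the row is broadcast over all rows of B

# Trailing dimensions line up, so the plain subtraction is equivalent.
same = np.array_equal(D, B - C)
print(row.shape, same)
```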
