Deducting the median from each column
I have a dataframe, df, with numbers, like so:
1 1 1
2 1 1
2 1 3
I'd like to deduct the median from each column so that the median of each becomes 0.
-1 0 0
0 0 0
0 0 2
How do I do this in a pythonic way? I'm guessing it is possible without iterating over the values, computing the median and then deducting. I'd like to do it tersely, approximately like so:
from numpy import median
df -= median(df) #does not work, deducts median for whole dataframe
Just like this:
df -= df.median(axis=0)
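As a quick check, here is a minimal sketch that applies this to the example data from the question. The column names a, b and c are assumptions for illustration only, since the question does not name the columns, and the result is stored in a new variable instead of being written back in place.

```python
import pandas as pd

# Example data from the question; the column names are made up for illustration.
df = pd.DataFrame([[1, 1, 1], [2, 1, 1], [2, 1, 3]], columns=["a", "b", "c"])

# df.median(axis=0) returns a Series holding one median per column.
# Subtracting it aligns the Series with the column labels.
centered = df - df.median(axis=0)

print(centered)
print(centered.median(axis=0))  # the median of every column after centering
```

The subtraction produces floats even though the input holds integers, because the medians themselves are floats.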
numpy's median computes the median of the overall data by default. To accomplish this using numpy, try this code instead:
df -= median(df, axis=0)
For more detail, see the documentation: http://docs.scipy.org/doc/numpy/reference/generated/numpy.median.html
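To make the difference between the two calls concrete, here is a small sketch on the question's example data. Converting the dataframe with to_numpy() first is a choice made here so that the behaviour does not depend on how a given numpy version treats a DataFrame argument; it is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Example data from the question.
df = pd.DataFrame([[1, 1, 1], [2, 1, 1], [2, 1, 3]])
values = df.to_numpy()

# Without axis, numpy flattens the data and returns a single number.
overall = np.median(values)

# With axis=0, numpy returns one median per column.
per_column = np.median(values, axis=0)

# A 1-D array with one entry per column is subtracted from every row.
centered = df - per_column

print(overall)
print(per_column)
print(centered)
```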
Some testing in ipython showed:
In [23]: A = numpy.arange(9)
In [24]: B = A.reshape((3,3))
In [25]: C = numpy.median(B,axis=0)
In [26]: D = B - C[None,:]
In [27]: B
Out[27]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [28]: D
Out[28]:
array([[-3., -3., -3.],
       [ 0.,  0.,  0.],
       [ 3.,  3.,  3.]])
In [29]: C
Out[29]: array([ 3., 4., 5.])
So the next line gets the median along the columns:
C = numpy.median(B,axis=0)
And the next line subtracts it from the matrix, column by column:
D = B - C[None,:]
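The indexing with None can be looked at in isolation. The sketch below reuses B, C and D from the session above; the names row and same are introduced here for illustration. It shows that C[None,:] only adds a leading axis of length 1, and that for this case plain B - C gives the same result, because numpy broadcasting lines up the trailing axes.

```python
import numpy as np

B = np.arange(9).reshape((3, 3))
C = np.median(B, axis=0)   # one median per column, shape (3,)

# None inserts a new leading axis, turning C into a single row.
row = C[None, :]           # shape (1, 3)

# The single row is stretched over all rows of B during subtraction.
D = B - row

# Broadcasting already aligns trailing axes, so the explicit axis is optional here.
same = np.array_equal(D, B - C)

print(C.shape, row.shape)
print(D)
print(same)
```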