How to find the non-zero median/mean of multiple columns in pandas?

Question

I have a long list of columns for which I want to calculate the non-zero median, mean, and std in one go. I cannot just delete the rows that contain 0 in one column, because the value of another column in the same row may not be 0.

Below is the code I currently have, which calculates the median, mean, etc. with zeros included.

    import numpy as np

    agg_list_oper = {
        'ABC1': [max, np.std, np.mean, np.median],
        'ABC2': [max, np.std, np.mean, np.median],
        'ABC3': [max, np.std, np.mean, np.median],
        'ABC4': [max, np.std, np.mean, np.median],
        # ..... (remaining columns elided)
    }

    df = df_tmp.groupby(['id']).agg(agg_list_oper).reset_index()

I know I can write long code with loops to process one column at a time. Is there a way to do this elegantly with pandas groupby.agg() or some other function?

Answer 1

You can temporarily replace the 0s with NaNs. pandas will then ignore the NaNs when calculating the medians.

    df_tmp.replace(0, np.nan).groupby(['id']).agg(agg_list_oper).reset_index()
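As a minimal sketch of the same idea on made-up data: the sample values and the `value_cols` list below are assumptions for illustration, not from the question. Two choices here differ from the one-liner above. The zeros are replaced only in the value columns, because `replace(0, np.nan)` on the whole frame would also turn an `id` of 0 into NaN, and `groupby` drops NaN keys by default. The aggregations are also given as strings, which select pandas' own NaN-skipping reductions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names follow the question, values are made up.
df_tmp = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 2],
    "ABC1": [0, 2, 4, 0, 0, 6],
    "ABC2": [1, 0, 3, 5, 7, 0],
})

# Assumed list of columns to aggregate (everything except the grouping key).
value_cols = ["ABC1", "ABC2"]

# String names select pandas' built-in reductions, which skip NaN.
agg_list_oper = {col: ["max", "std", "mean", "median"] for col in value_cols}

# Replace zeros with NaN only in the value columns, leaving 'id' untouched.
non_zero = df_tmp.copy()
non_zero[value_cols] = non_zero[value_cols].replace(0, np.nan)

# The result has 'id' as the index and (column, statistic) pairs as columns.
df = non_zero.groupby("id").agg(agg_list_oper)
print(df)
```

For `id` 1, the non-zero values of `ABC1` are 2 and 4, so its mean comes out as 3 rather than the 2 it would be with the zero included. A group with a single non-zero value gets NaN for `std`, since the sample standard deviation is undefined for one observation.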

Answer 1: score 2, accepted, 2016-08-18 11:01:06