
Calculating means for multiple columns, in different rows in pandas

I have a CSV file like this:

-Species-    -Strain-       -A-       -B-       -C-       -D-
 Species1    Strain1.1         0.2       0.1       0.1       0.4
 Species1    Strain1.1         0.2       0.7       0.2       0.2
 Species1    Strain1.2         0.1       0.6       0.1       0.3
 Species1    Strain1.1         0.2       0.6       0.2       0.6
 Species2    Strain2.1         0.3       0.3       0.3       0.1
 Species2    Strain2.2         0.6       0.2       0.6       0.2
 Species2    Strain2.2         0.2       0.1       0.4       0.2

I would like to calculate a mean (average) for each unique strain for each of the columns (A-D). How would I go about doing it?

I tried df.groupby(['Strain','Species']).mean().mean(1), but that still seems to give me multiple versions of strains in the resulting dataframe, rather than the means for each column for each unique strain.

Essentially, I would like a mean result for A, B, C and D per strain.

Apologies for being unclear, I'm struggling to get my head around this, and I'm very new to programming!

IIUC, you simply need to call

df.groupby(['Species', 'Strain']).mean()

                      A         B         C    D 
Species   Strain                               
Species1  Strain1.1  0.2  0.466667  0.166667  0.4
          Strain1.2  0.1  0.600000  0.100000  0.3
Species2  Strain2.1  0.3  0.300000  0.300000  0.1
          Strain2.2  0.4  0.150000  0.500000  0.2

What you were doing when you called df.groupby(['Strain','Species']).mean().mean(1) was taking the mean of the 4 means in A, B, C, and D. mean(1) means take the mean along axis 1 (i.e. across the columns), so each strain ends up with a single number instead of one mean per column.
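The difference between the two calls is easiest to see from the shape of the result. This is a small sketch using the question's sample data typed in as a literal; the variable names per_column and collapsed are only illustrative.

```python
import pandas as pd

# Same sample data as in the question, entered as a literal.
df = pd.DataFrame({
    "Species": ["Species1"] * 4 + ["Species2"] * 3,
    "Strain": ["Strain1.1", "Strain1.1", "Strain1.2", "Strain1.1",
               "Strain2.1", "Strain2.2", "Strain2.2"],
    "A": [0.2, 0.2, 0.1, 0.2, 0.3, 0.6, 0.2],
    "B": [0.1, 0.7, 0.6, 0.6, 0.3, 0.2, 0.1],
    "C": [0.1, 0.2, 0.1, 0.2, 0.3, 0.6, 0.4],
    "D": [0.4, 0.2, 0.3, 0.6, 0.1, 0.2, 0.2],
})

# DataFrame: one row per strain, columns A-D are kept separate.
per_column = df.groupby(["Species", "Strain"]).mean()

# Series: the extra mean(axis=1) averages A-D together for each strain.
collapsed = per_column.mean(axis=1)

print(per_column.shape)  # (4, 4)
print(collapsed.shape)   # (4,)
```

If you want a mean per column, stop after the first .mean() and drop the second one.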


 