
In pandas, how to compute statistics of change across groups

In pandas, what is the right way to compute statistics of change between groups, for example:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a' : [0,0,0,0,1,1,1,1] * 2,
    'b' : [0,0,1,1,0,0,1,1] * 2,
    'mode': ['baseline','test'] * 8,
    'value' : np.random.uniform(0,1,16)
})

I can compute the group means via:

df.groupby(['a','b','mode']).mean()

If I want to make a new dataframe with columns a and b, plus a column holding the difference between the mean "baseline" value and the mean "test" value, how do I do that?

Do you mean something like this:

new_df = (df.groupby(['a','b','mode'])['value'].mean().unstack())

out = (new_df['baseline'] - new_df['test'])

Output (the numbers vary from run to run because the data is random and no seed is set):

a  b
0  0   -0.326467
   1    0.022417
1  0    0.428539
   1    0.019359
dtype: float64
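
The result above is a Series indexed by a and b, while the question asks for a dataframe with a and b as ordinary columns. A minimal sketch of one way to get there is to call reset_index on the difference. The column name 'diff', the variable names, and the fixed seed are my own assumptions for illustration and are not part of the original answer:

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)

df = pd.DataFrame({
    'a': [0, 0, 0, 0, 1, 1, 1, 1] * 2,
    'b': [0, 0, 1, 1, 0, 0, 1, 1] * 2,
    'mode': ['baseline', 'test'] * 8,
    'value': rng.uniform(0, 1, 16),
})

# Mean value per (a, b, mode), with 'mode' pivoted into columns.
means = df.groupby(['a', 'b', 'mode'])['value'].mean().unstack()

# baseline minus test, then move a and b from the index back into columns.
# 'diff' is a hypothetical column name chosen for this sketch.
result = (means['baseline'] - means['test']).reset_index(name='diff')
print(result)
```

This keeps the same sign convention as the answer (baseline minus test); swap the operands if the change should be measured the other way.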
