In pandas, how to compute statistics of change across groups

Question

In pandas, what is the right way to compute statistics of change between groups? For example, given:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [0,0,0,0,1,1,1,1] * 2,
    'b': [0,0,1,1,0,0,1,1] * 2,
    'mode': ['baseline','test'] * 8,
    'value': np.random.uniform(0,1,16)
})

I can compute the group means via:

df.groupby(['a','b','mode']).mean()

How do I build a new dataframe with the columns a and b, plus a column holding the difference between the mean "baseline" value and the mean "test" value?

Answer 1

Do you mean something like this?

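# Mean of 'value' per (a, b, mode); unstack moves 'mode' into columns
new_df = df.groupby(['a','b','mode'])['value'].mean().unstack()

# Difference between the baseline mean and the test mean for each (a, b)
out = new_df['baseline'] - new_df['test']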

Output (the exact values will differ from run to run, since the data is random):

a  b
0  0   -0.326467
   1    0.022417
1  0    0.428539
   1    0.019359
dtype: float64
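
The result above is a Series indexed by (a, b), whereas the question asks for a dataframe with a and b as ordinary columns. One possible way to get there, as a minimal sketch, is to name the Series and call reset_index on it. The column name 'diff' and the seeded generator are assumptions added here for illustration; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'a': [0, 0, 0, 0, 1, 1, 1, 1] * 2,
    'b': [0, 0, 1, 1, 0, 0, 1, 1] * 2,
    'mode': ['baseline', 'test'] * 8,
    'value': rng.uniform(0, 1, 16),
})

# Mean per (a, b, mode), with the two modes spread into columns.
means = df.groupby(['a', 'b', 'mode'])['value'].mean().unstack('mode')

# Difference of the means, then move a and b from the index into columns.
# 'diff' is a column name chosen here for illustration.
result = (means['baseline'] - means['test']).rename('diff').reset_index()
print(result)
```

Here result has one row per (a, b) pair and three columns: a, b and diff. Swapping the operands gives test minus baseline instead, if that is the direction of change you want.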
