In pandas, what is the right way to compute statistics of change between groups, for example:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [0, 0, 0, 0, 1, 1, 1, 1] * 2,
    'b': [0, 0, 1, 1, 0, 0, 1, 1] * 2,
    'mode': ['baseline', 'test'] * 8,
    'value': np.random.uniform(0, 1, 16)
})
I can compute the group means via:
df.groupby(['a','b','mode']).mean()
If I want to make a new dataframe whose columns are a and b, plus a column holding the difference between the mean "baseline" value and the mean "test" value for each (a, b) group, how do I do that?
Do you mean something like this:
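# Mean of 'value' per (a, b, mode); unstack moves 'mode' into columns
new_df = df.groupby(['a','b','mode'])['value'].mean().unstack()
# Per-group difference: baseline mean minus test mean
out = new_df['baseline'] - new_df['test']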
Output:
a  b
0  0   -0.326467
   1    0.022417
1  0    0.428539
   1    0.019359
dtype: float64
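The result above is a Series indexed by (a, b). If a flat dataframe with a and b as ordinary columns is what's wanted, something like the following sketch might work. The column name `diff` and the fixed seed are my own assumptions, added only to make the example reproducible and readable; `pivot_table` is shown as an alternative way to get the same per-group means.

```python
import numpy as np
import pandas as pd

# Fixed seed is an assumption, used only so the example is reproducible
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'a': [0, 0, 0, 0, 1, 1, 1, 1] * 2,
    'b': [0, 0, 1, 1, 0, 0, 1, 1] * 2,
    'mode': ['baseline', 'test'] * 8,
    'value': rng.uniform(0, 1, 16),
})

# Mean per (a, b, mode), with 'mode' moved into columns
means = df.groupby(['a', 'b', 'mode'])['value'].mean().unstack('mode')

# baseline minus test, named 'diff' (hypothetical name), with a and b as columns
result = (means['baseline'] - means['test']).rename('diff').reset_index()

# Alternative: pivot_table computes the same group means in one call
pivoted = df.pivot_table(index=['a', 'b'], columns='mode',
                         values='value', aggfunc='mean')
result_alt = (pivoted['baseline'] - pivoted['test']).rename('diff').reset_index()

print(result)
```

Here `result` has one row per (a, b) pair and three columns: a, b and diff. Swap the operands if you want test minus baseline instead.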