My data looks similar to this:
index name number difference
0 AAA 10 0
1 AAA 20 10
2 BBB 1 0
3 BBB 2 1
4 CCC 5 0
5 CCC 10 5
6 CCC 10.5 0.5
I need to calculate the standard deviation of the difference column for each group of name.
I tried
data[['difference']].groupby(['name']).agg(['mean', 'std'])
and
data["std"]=(data['difference'].groupby('name').std())
but both raised a KeyError for the key passed to groupby(). I tried to resolve it with:
data.columns = data.columns.str.strip()
but the error persists.
Thanks in advance.
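Both attempts select the difference column before grouping, so the name column is no longer available when groupby() looks for it, which is why you get the KeyError. Call groupby(['name']) on the full data frame first, and then apply agg only to the column of interest: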
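import pandas as pd

data = pd.DataFrame({'name': ['AAA', 'AAA', 'BBB', 'BBB', 'CCC', 'CCC', 'CCC'],
                     'number': [10, 20, 1, 2, 5, 10, 10.5],
                     'difference': [0, 10, 0, 1, 0, 5, 0.5]})
# Group the full frame by name, then select the column and aggregate it
data.groupby(['name'])['difference'].agg(['mean', 'std'])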
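The second attempt in the question assigns the result to data["std"], which suggests the goal may be a per-row column rather than a summary table. If so, a minimal sketch using transform could look like the following. It rebuilds the sample frame from the question so it runs on its own; the column name std is taken from the question, and whether a broadcast column is really what is wanted is an assumption.

```python
import pandas as pd

# Sample data copied from the question
data = pd.DataFrame({
    'name': ['AAA', 'AAA', 'BBB', 'BBB', 'CCC', 'CCC', 'CCC'],
    'number': [10, 20, 1, 2, 5, 10, 10.5],
    'difference': [0, 10, 0, 1, 0, 5, 0.5],
})

# transform('std') computes the sample standard deviation (ddof=1) per group
# and broadcasts it back so every row gets its own group's value.
data['std'] = data.groupby('name')['difference'].transform('std')

print(data)
```

Unlike agg, which returns one row per group, transform returns a result aligned with the original index, so it can be assigned directly as a new column.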