Calculate standard deviation for groups of values using Python

Question

My data looks similar to this:

index  name  number  difference
0      AAA   10      0
1      AAA   20      10
2      BBB   1       0
3      BBB   2       1
4      CCC   5       0
5      CCC   10      5
6      CCC   10.5    0.5

I need to calculate the standard deviation of the difference column for each group of name.

I tried

data[['difference']].groupby(['name']).agg(['mean', 'std'])

and

data["std"]=(data['difference'].groupby('name').std())

but both raised a KeyError for the key passed to groupby(). I tried to resolve it with:

data.columns = data.columns.str.strip()

but the error persists.

Thanks in advance.

Answer 1

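Both attempts select the difference column before grouping, so the name column is no longer there when groupby() looks for it, hence the KeyError. Call groupby(['name']) on the full data frame first, and apply agg only to the column of interest: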

import pandas as pd

data = pd.DataFrame({'name':['AAA','AAA','BBB','BBB','CCC','CCC','CCC'],
                    'number':[10,20,1,2,5,10,10.5],
                    'difference':[0,10,0,1,0,5,0.5]})
# Group the full frame by name, then select the column to aggregate
data.groupby(['name'])['difference'].agg(['mean', 'std'])
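
If you also want the per-group standard deviation as a new column on the original rows (which is what the second attempt in the question seems to be after), transform is one way to do it. This is only a minimal sketch: it assumes the same sample data as the question and reuses the column name std from there.

```python
import pandas as pd

data = pd.DataFrame({
    'name': ['AAA', 'AAA', 'BBB', 'BBB', 'CCC', 'CCC', 'CCC'],
    'number': [10, 20, 1, 2, 5, 10, 10.5],
    'difference': [0, 10, 0, 1, 0, 5, 0.5],
})

# transform returns one value per original row, aligned with data's index,
# so every row receives the sample standard deviation of its own group
data['std'] = data.groupby('name')['difference'].transform('std')
```

Unlike agg, which returns one row per group, transform keeps the original shape, so the result can be assigned straight back to the data frame.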

solution1
2 ACCEPTED 2022-01-03 17:24:06
