
How to group by two columns using pandas

I have three columns: _a, _b, and _c.

import numpy as np 
import pandas as pd
df = pd.DataFrame({'_a':[1,1,1,2,2,3,3],'_b':[3,3,5,3,7,3,9], '_c':[10,11,12,13,14,15,16], 'a_b_3':[21,21,21,13,13,15,15]})  # a_b_3 holds the desired result
df

   _a  _b  _c  a_b_3
0   1   3  10     21
1   1   3  11     21
2   1   5  12     21
3   2   3  13     13
4   2   7  14     13
5   3   3  15     15
6   3   9  16     15

I need to create the column a_b_3 (for each value of _a, the sum of all _c values where _b = 3) using pandas groupby. Thank you in advance.

Use:

df['a_b_3'] = df['_a'].map(df[df['_b'] == 3].groupby('_a')['_c'].sum())

Output:

   _a  _b  _c  a_b_3
0   1   3  10     21
1   1   3  11     21
2   1   5  12     21
3   2   3  13     13
4   2   7  14     13
5   3   3  15     15
6   3   9  16     15

Explanation

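First, filter down to the rows where _b equals 3, then group by _a and sum _c to get a Series indexed by _a. Then use map to look up each row's _a value in that Series and write the result back to the original dataframe. Any _a value that has no rows with _b equal to 3 gets NaN.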
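
The steps can also be written out one at a time, which makes the intermediate Series easier to inspect. This is only a sketch: the names filtered, sums, and a_b_3_alt are illustrative and do not appear in the question. The last statement shows one possible alternative that, as far as I can tell, gives the same result here but produces 0 instead of NaN for an _a group with no _b == 3 rows.

```python
import pandas as pd

df = pd.DataFrame({
    '_a': [1, 1, 1, 2, 2, 3, 3],
    '_b': [3, 3, 5, 3, 7, 3, 9],
    '_c': [10, 11, 12, 13, 14, 15, 16],
})

# Step 1: keep only the rows where _b equals 3.
filtered = df[df['_b'] == 3]

# Step 2: sum _c within each _a group; the result is a Series indexed by _a.
sums = filtered.groupby('_a')['_c'].sum()

# Step 3: look up each row's _a value in that Series.
df['a_b_3'] = df['_a'].map(sums)

# Alternative (illustrative): replace _c with 0 where _b != 3,
# then broadcast each _a group's sum back onto its rows.
df['a_b_3_alt'] = df['_c'].where(df['_b'] == 3, 0).groupby(df['_a']).transform('sum')

print(df)
```

If missing groups should stay with map but show 0, appending .fillna(0) to the map call is another option; note that the column becomes float once NaN has been introduced.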

