简体   繁体   中英

group rows based on partial strings from two columns and sum values

df = pd.DataFrame({'c1':['Ax','Ay','Bx','By'], 'c2':['Ay','Ax','By','Bx'], 'c3':[1,2,3,4]})

   c1  c2  c3
0  Ax  Ay   1
1  Ay  Ax   2
2  Bx  By   3
3  By  Bx   4

I'd like to sum the c3 values by aggregating the same xy combinations from the c1 and c2 columns.

The expected output is

  c1  c2  c3
0  x   y   4   #[Ax Ay] + [Bx By]
1  y   x   6   #[Ay Ax] + [By Bx]

You can select values in c1 and c2 without first letters and aggregate sum :

df = df.groupby([df.c1.str[1:], df.c2.str[1:]]).sum().reset_index()
print (df)
  c1 c2  c3
0  x  y   4
1  y  x   6

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM