
New column in pandas multiindex dataframe based on existing column

I couldn't find an answer to this specific problem, so I thought I'd share.

Question

Given the following dataframe:

>>> import pandas as pd

>>> df = pd.DataFrame({
...     'A': [1, 1, 2, 2],
...     'B': ['a', 'b', 'a', 'b'],
...     'C': [1, 2, 3, 4]
... }).set_index(['A', 'B'])

>>> df
     C
A B   
1 a  1
  b  2
2 a  3
  b  4

How can you add a new column D whose values are a function of the C values within each A group (for example, each C as a fraction of its group's total)?

Answer

>>> df['D'] = df.groupby('A')['C'].transform(lambda x: x / x.sum())

>>> df
     C         D
A B             
1 a  1  0.333333
  b  2  0.666667
2 a  3  0.428571
  b  4  0.571429
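
For this particular function (each value divided by its group total), the lambda is not strictly needed. The following is a minimal sketch of an alternative, not part of the original answer: it asks transform for the built-in 'sum' aggregation, which returns the group total aligned to the original rows, and then divides the column by it. The variable name group_total is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 2, 2],
    'B': ['a', 'b', 'a', 'b'],
    'C': [1, 2, 3, 4],
}).set_index(['A', 'B'])

# Sum of C within each A group, repeated for every row of that group,
# so the result has the same MultiIndex as df.
group_total = df.groupby(level='A')['C'].transform('sum')

# Ordinary element-wise division; should give the same D as the lambda version.
df['D'] = df['C'] / group_total

print(df)
```

Because transform returns a result with the same index as the input, the assignment lines up row by row with no merge or reindex. For an arbitrary per-group function the lambda form above remains the more general choice.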
