New column in pandas multiindex dataframe based on existing column

Question

I couldn't find an answer to this specific problem, so thought I'd share.

Given the following dataframe:

>>>: import pandas as pd

>>>: df = pd.DataFrame({
...: 'A': [1, 1, 2, 2],
...: 'B': ['a', 'b', 'a', 'b'],
...: 'C': [1, 2, 3, 4]
...: }).set_index(['A','B'])

>>>: df
     C
A B   
1 a  1
  b  2
2 a  3
  b  4

How can you add a new column D whose values are a function of the C values within each group of index level A?

Answer 1

>>>: df['D'] = df.groupby('A')['C'].transform(lambda x: x / x.sum())

>>>: df
     C         D
A B             
1 a  1  0.333333
  b  2  0.666667
2 a  3  0.428571
  b  4  0.571429
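
For what it's worth, the same result can probably be reached without a lambda. The sketch below assumes a pandas version where groupby accepts level= and transform accepts the string 'sum'; the name group_sum is just an illustrative helper, not part of the original answer. Here transform returns a Series aligned to the original MultiIndex, so an ordinary division lines up row by row.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 2, 2],
    'B': ['a', 'b', 'a', 'b'],
    'C': [1, 2, 3, 4],
}).set_index(['A', 'B'])

# Group explicitly on index level 'A'; transform('sum') broadcasts each
# group's total back onto every row of that group.
group_sum = df.groupby(level='A')['C'].transform('sum')  # hypothetical helper name

# Each C value as a share of its group's total.
df['D'] = df['C'] / group_sum
```

The lambda version in the answer is more flexible when the per-group function is not a simple built-in aggregation; the string form tends to be faster on large frames.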
