Calculate the percentage contribution of each value in a column in Python

I have the following data frame:

item1  item2  item3
x      y      z
x1     y1     z1
x      y2     z2
x      y      z1
x2     y      z
x2     y1     z2

I want to find the percentage contribution of each distinct value in a column relative to all values in that column (for example, the share of x, x1 and x2 in item1, and likewise for item2 and item3).

The result should be the following data frame:

item1  %con_item1  item2  %con_item2  item3  %con_item3
x      50          y      50          z      33.33
x1     16.66       y1     33.33       z1     33.33
x2     33.33       y2     16.66       z2     33.33

Use value_counts with the normalize parameter set to True, which returns each distinct value's share of the column as a fraction between 0 and 1:

pd.concat([df[i].value_counts(normalize=True).reset_index() for i in df.columns], axis=1)

Output:

  index     item1 index     item2 index     item3
0     x  0.500000     y  0.500000    z1  0.333333
1    x2  0.333333    y1  0.333333    z2  0.333333
2    x1  0.166667    y2  0.166667     z  0.333333
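As a minimal sketch of what normalize=True does, the snippet below rebuilds the question's data frame (typed in by hand here, so the construction is an assumption) and checks one column against a manual "count divided by number of rows" calculation. The index/item1 column labels shown above come from older pandas; from pandas 2.0 onward, reset_index() on a value_counts result is expected to label them item1 and proportion instead.

```python
import pandas as pd

# Sample data reconstructed by hand from the question.
df = pd.DataFrame({
    "item1": ["x", "x1", "x", "x", "x2", "x2"],
    "item2": ["y", "y1", "y2", "y", "y", "y1"],
    "item3": ["z", "z1", "z2", "z1", "z", "z2"],
})

# Fraction of rows taken by each distinct value in item1.
shares = df["item1"].value_counts(normalize=True)

# The same fractions computed by hand: raw counts divided by the row count.
# This matches only because the column has no missing values;
# value_counts drops NaN by default before normalizing.
manual = df["item1"].value_counts() / len(df)

print(shares)
```

The fractions for a column always sum to 1, so multiplying by 100 gives percentages that sum to 100.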

Updated answer, scaled to percentages and using the column names from the question:

pd.concat([df[i].value_counts(normalize=True)
                .mul(100.0)
                .rename_axis(i)
                .reset_index(name='%con_'+i)  for i in df.columns], axis=1)

Output:

  item1  %con_item1 item2  %con_item2 item3  %con_item3
0     x   50.000000     y   50.000000    z1   33.333333
1    x2   33.333333    y1   33.333333    z2   33.333333
2    x1   16.666667    y2   16.666667     z   33.333333
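For a version that can be run end to end, here is a hedged sketch that wraps the same chain in a function and rounds to two decimals. The function name percent_contribution and the decimals parameter are hypothetical, introduced only for illustration, and the data frame is again typed in by hand from the question.

```python
import pandas as pd

# Sample data reconstructed by hand from the question.
df = pd.DataFrame({
    "item1": ["x", "x1", "x", "x", "x2", "x2"],
    "item2": ["y", "y1", "y2", "y", "y", "y1"],
    "item3": ["z", "z1", "z2", "z1", "z", "z2"],
})


def percent_contribution(frame, decimals=2):
    """Hypothetical helper: per-column value shares in percent, side by side."""
    parts = [
        frame[col].value_counts(normalize=True)   # fraction per distinct value
                  .mul(100.0)                     # scale to percent
                  .round(decimals)                # e.g. 16.666667 -> 16.67
                  .rename_axis(col)               # name the value column
                  .reset_index(name="%con_" + col)
        for col in frame.columns
    ]
    # Each part has its own 0..n-1 index, so concat lines them up by position.
    return pd.concat(parts, axis=1)


result = percent_contribution(df)
print(result)
```

Rounding gives 16.67 where the question shows 16.66, which looks truncated rather than rounded. If the columns had different numbers of distinct values, the shorter ones would be padded with NaN, because concat aligns the pieces on their row position.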
