简体   繁体   中英

pandas crosstab simplified view of multiple columns

I already referred the posts here and here . Don't mark it as duplicate

I have a dataframe like as below

id,status,country,amount,qty
1,pass,USA,123,4500
1,pass,USA,156,3210
1,fail,UK,687,2137
1,fail,UK,456,1236
2,pass,AUS,216,324
2,pass,AUS,678,241
2,nan,ANZ,637,213
2,pass,ANZ,213,543

sf = pd.read_clipboard(sep=',')

I would like to get the percentage of values from each column as a seperate column

So, with the help of this post , I tried the below

Approach - 1 Doesn't give expected output shape

(pd.crosstab(sf['id'],[sf['status'].fillna('nan'),sf['country'].fillna('nan')],normalize=0)
          .drop('nan', 1)
          .mul(100)).reset_index()

Approach - 2 - Doesn't give expected output

sf_inv= sf.melt()
pd.crosstab(sf_inv.value, sf_inv.variable)

I expect my output to be like as below

在此处输入图像描述

You can use crosstab with normalize='index' on your different columns andconcat the results:

pd.concat([pd.crosstab(sf['id'], sf[c], normalize='index')
           for c in ['status', 'country']], axis=1).mul(100).add_suffix('_pct')

output:

    fail_pct  pass_pct  ANZ_pct  AUS_pct  UK_pct  USA_pct
id                                                       
1       50.0      50.0      0.0      0.0    50.0     50.0
2        0.0     100.0     50.0     50.0     0.0      0.0
handling NaNs:
pd.concat([pd.crosstab(sf['id'], sf[c].fillna('NA'), normalize='index')
             .drop(columns='NA', errors='ignore')
           for c in ['status', 'country']], axis=1).mul(100).add_suffix('_pct')

output:

    fail_pct  pass_pct  ANZ_pct  AUS_pct  UK_pct  USA_pct
id                                                       
1       50.0      50.0      0.0      0.0    50.0     50.0
2        0.0      75.0     50.0     50.0     0.0      0.0

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM