Trying to calculate a percentage and add it as a new column using Pandas

I have a pandas DataFrame that I created using groupby, and the result is this:

          loan_type
type            
risky      23150
safe       99457

I want to create a column called pct and add it to the DataFrame. I did this:

total = loans.sum(numeric_only=True)
loans['pct'] = loans.apply(lambda x:x/ total)

And the result was this:

       loan_type  pct
type                 
risky      23150  NaN
safe       99457  NaN

At this point I'm not sure what I need to do to get that percentage column. The code below shows how I created the whole thing:

import numpy as np
import pandas as pd
bad_loans = np.array(club['bad_loans'])

for index, row in enumerate(bad_loans):
    if row == 0:
        bad_loans[index] = 1
    else:
        bad_loans[index] = -1

loans = pd.DataFrame({'loan_type' : bad_loans})
loans['type'] = np.where(loans['loan_type'] == 1, 'safe', 'risky')
loans = np.absolute(loans.groupby(['type']).agg({'loan_type': 'sum'}))
total = loans.sum(numeric_only=True)
loans['pct'] = loans.apply(lambda x:x/ total)

The problem is that you are dividing not by a scalar but by a one-element Series, and because the indexes do not align you get NaNs.
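
To make the alignment issue concrete, here is a minimal sketch that rebuilds the grouped frame by hand (the two counts are copied from the question's output; the variable name misaligned is only for illustration) and shows that the divisor is indexed by column name, not by the row labels:

```python
import pandas as pd

# Rebuild the grouped frame from the question (counts copied from its output).
loans = pd.DataFrame(
    {"loan_type": [23150, 99457]},
    index=pd.Index(["risky", "safe"], name="type"),
)

# Summing a DataFrame returns a Series indexed by column name.
total = loans.sum(numeric_only=True)
print(total.index.tolist())  # ['loan_type']

# Series / Series aligns on index labels. The labels 'risky' and 'safe'
# never match 'loan_type', so every entry of the result is NaN.
misaligned = loans["loan_type"] / total
print(misaligned)
```

Because no label is shared between the two indexes, pandas has nothing to divide pairwise, which is where the NaN values in pct come from.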

I think the simplest fix is to convert the Series total to a NumPy array:

total = loans.sum(numeric_only=True)
loans['pct'] = loans.loan_type / total.values

print (loans)
       loan_type       pct
type                      
risky      23150  0.188815
safe       99457  0.811185

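Or select the single value by position with .iloc[0] (a bare [0] relies on positional fallback, which newer pandas versions deprecate), so the result is a scalar:

total = loans.sum(numeric_only=True).iloc[0]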
loans['pct'] = loans.loan_type / total

print (loans)
       loan_type       pct
type                      
risky      23150  0.188815
safe       99457  0.811185
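
A further variant, offered as a sketch rather than as part of the original answer: summing the column itself returns a scalar directly, so no conversion or indexing is needed. The frame is rebuilt by hand here from the counts shown in the question.

```python
import pandas as pd

# Grouped frame rebuilt by hand from the counts in the question.
loans = pd.DataFrame(
    {"loan_type": [23150, 99457]},
    index=pd.Index(["risky", "safe"], name="type"),
)

# Series.sum() returns a scalar, so the division broadcasts over every row.
loans["pct"] = loans["loan_type"] / loans["loan_type"].sum()
print(loans)
```

This should give the same pct values as the two approaches above, since the divisor is the same total in every case.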
