Trying to calculate a percentage and add it as a new column using Pandas
I have a pandas DataFrame that I created using groupby, and the result looks like this:
       loan_type
type
risky      23150
safe       99457
I want to create a column called pct and add it to the DataFrame. I did this:
total = loans.sum(numeric_only=True)
loans['pct'] = loans.apply(lambda x:x/ total)
And the result was this:
       loan_type  pct
type
risky      23150  NaN
safe       99457  NaN
At this point I'm not sure what I need to do to get that percentage column. The code below shows how I created the whole thing:
import numpy as np
import pandas as pd

# Recode bad_loans: 0 becomes 1 (safe), anything else becomes -1 (risky)
bad_loans = np.array(club['bad_loans'])
for index, row in enumerate(bad_loans):
    if row == 0:
        bad_loans[index] = 1
    else:
        bad_loans[index] = -1
loans = pd.DataFrame({'loan_type' : bad_loans})
loans['type'] = np.where(loans['loan_type'] == 1, 'safe', 'risky')
loans = np.absolute(loans.groupby(['type']).agg({'loan_type': 'sum'}))
total = loans.sum(numeric_only=True)
loans['pct'] = loans.apply(lambda x:x/ total)
The problem is that you are dividing by a one-element Series rather than by a scalar value. pandas aligns the two objects on their indexes before dividing, the labels do not match, and so you get NaNs.
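To see where the NaN values come from, here is a minimal sketch. It rebuilds the grouped frame by hand from the numbers in the question (the construction is an assumption, only the values come from the post) and divides the column first by the one-element Series and then by a plain scalar:

```python
import pandas as pd

# Rebuild the grouped result by hand (values copied from the question).
loans = pd.DataFrame(
    {"loan_type": [23150, 99457]},
    index=pd.Index(["risky", "safe"], name="type"),
)

# total is a Series with a single entry labelled 'loan_type'.
total = loans.sum(numeric_only=True)

# Series / Series aligns on index labels first. The column is labelled
# 'risky' and 'safe', total is labelled 'loan_type', so no label matches
# and every entry of the result is NaN.
misaligned = loans["loan_type"] / total
print(misaligned)

# A plain scalar has no index to align on, so the division is element-wise.
aligned = loans["loan_type"] / total.iloc[0]
print(aligned)
```

The misaligned result carries the union of both indexes (three labels), which is why assigning it back to the frame fills pct with NaN.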
I think the simplest fix is to convert the Series total to a NumPy array:
total = loans.sum(numeric_only=True)
loans['pct'] = loans.loan_type / total.values
print (loans)
       loan_type       pct
type
risky      23150  0.188815
safe       99457  0.811185
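Or select the single value by position with .iloc[0], so that the divisor is a plain number (a bare [0] also works on older pandas versions, but positional [] lookup on a label-indexed Series is deprecated in recent ones):
total = loans.sum(numeric_only=True).iloc[0]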
loans['pct'] = loans.loan_type / total
print (loans)
       loan_type       pct
type
risky      23150  0.188815
safe       99457  0.811185
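As a further sketch that is not part of the original answer: summing the column itself already gives a scalar, so the intermediate total can be skipped. The frame is again rebuilt by hand from the question's numbers, which is an assumption made for illustration.

```python
import pandas as pd

# Grouped result rebuilt by hand (values copied from the question).
loans = pd.DataFrame(
    {"loan_type": [23150, 99457]},
    index=pd.Index(["risky", "safe"], name="type"),
)

# Series.sum() returns a scalar, so there is no index to misalign.
loans["pct"] = loans["loan_type"] / loans["loan_type"].sum()
print(loans)
```

If the raw 'safe'/'risky' labels are still available, Series.value_counts(normalize=True) returns the same shares directly.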