Pandas: Calculate the percentage between two rows and add the value as a column
I have a Pandas DataFrame that I created with groupby, and the result looks like this:
       loan_type
type
risky      23150
safe       99457
I want to create a column named pct and add it to the DataFrame, which I tried to do like this:
total = loans.sum(numeric_only=True)
loans['pct'] = loans.apply(lambda x:x/ total)
The result looks like this:
       loan_type  pct
type
risky      23150  NaN
safe       99457  NaN
At this point I am not sure what I need to do to get the percentage column. See the code below for how I created the whole thing:
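import numpy as np
import pandas as pd

# club is my source DataFrame (not shown here)
bad_loans = np.array(club['bad_loans'])
# Recode the flag: 0 becomes 1 (safe), anything else becomes -1 (risky)
for index, row in enumerate(bad_loans):
    if row == 0:
        bad_loans[index] = 1
    else:
        bad_loans[index] = -1
loans = pd.DataFrame({'loan_type': bad_loans})
loans['type'] = np.where(loans['loan_type'] == 1, 'safe', 'risky')
# Sum the +1/-1 codes per type; the absolute value turns the sums into counts
loans = np.absolute(loans.groupby(['type']).agg({'loan_type': 'sum'}))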
total = loans.sum(numeric_only=True)
loans['pct'] = loans.apply(lambda x:x/ total)
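The problem is that you are not dividing by a scalar value but by a one-element Series. The index of total (the column name loan_type) does not align with the index of the column being divided (risky, safe), so you get NaNs.
I think the simplest fix is to convert the Series total to a numpy array: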
total = loans.sum(numeric_only=True)
loans['pct'] = loans.loan_type / total.values
print(loans)
       loan_type       pct
type
risky      23150  0.188815
safe       99457  0.811185
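To see why the original attempt produced NaN, here is a minimal sketch that rebuilds the grouped frame from the two counts shown in the question (the variable names misaligned and pct are only illustrative). It divides once by the labelled Series and once by its underlying array:

```python
import pandas as pd

# Rebuild the grouped frame from the question using its two counts.
loans = pd.DataFrame(
    {"loan_type": [23150, 99457]},
    index=pd.Index(["risky", "safe"], name="type"),
)

# Summing a DataFrame gives a Series indexed by column name: ['loan_type'].
total = loans.sum(numeric_only=True)

# Series / Series aligns on labels. 'risky' and 'safe' never match
# 'loan_type', so every entry of the result is NaN.
misaligned = loans["loan_type"] / total

# A NumPy array carries no labels, so the length-1 array is simply
# broadcast over both rows.
pct = loans["loan_type"] / total.values

print(misaligned)
print(pct)
```

The first result contains only NaN because no labels match, while the second contains the two fractions.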
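Alternatively, select the single value by position, which yields a scalar (.iloc[0] is used here because plain [0] on a label-indexed Series is deprecated in recent pandas versions):
total = loans.sum(numeric_only=True).iloc[0]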
loans['pct'] = loans.loan_type / total
print(loans)
       loan_type       pct
type
risky      23150  0.188815
safe       99457  0.811185
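Another way to avoid the alignment issue, sketched here with the same two counts, is to sum the column itself rather than the whole DataFrame. Summing a single column returns a scalar, so no conversion is needed. The pct_display column is a hypothetical addition for readability and is not part of the original question:

```python
import pandas as pd

# Rebuild the grouped frame from the question using its two counts.
loans = pd.DataFrame(
    {"loan_type": [23150, 99457]},
    index=pd.Index(["risky", "safe"], name="type"),
)

# Summing one column (a Series) returns a scalar, so there is nothing to align.
loans["pct"] = loans["loan_type"] / loans["loan_type"].sum()

# Hypothetical extra column: the same share as a percentage, two decimals.
loans["pct_display"] = (loans["pct"] * 100).round(2)

print(loans)
```

This keeps the fix to a single line and does not depend on how the total Series is indexed.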