How to add calculated percentages to a pandas pivot table

Question

I have a pivot similar to the one in this question, which does not seem to have an answer. I have a pivot table called grouped, built as follows:

grouped = age_gender_bkts.pivot_table('population_in_thousands',index='gender',
columns='country_destination', aggfunc='sum').unstack()

It is derived from the pandas DataFrame age_gender_bkts:

import pandas as pd

age_gender_bkts = pd.read_csv('airbnb/age_gender_bkts.csv')
age_gender_bkts[:10]

  age_bucket country_destination gender  population_in_thousands  year
0       100+                  AU   male                        1  2015
1      95-99                  AU   male                        9  2015
2      90-94                  AU   male                       47  2015
3      85-89                  AU   male                      118  2015
4      80-84                  AU   male                      199  2015
5      75-79                  AU   male                      298  2015
6      70-74                  AU   male                      415  2015
7      65-69                  AU   male                      574  2015
8      60-64                  AU   male                      636  2015
9      55-59                  AU   male                      714  2015

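For each country, I would like to get each gender's share of population_in_thousands as a percentage, e.g. for AU: 12024/(11899+12024).

I am new to pandas and numpy and am looking for a general way to compute columns based on a pivot_table. Also, if the answer shows a way to create these groupings by gender and country without pivot_table, for example with groupby (which I could not figure out), that would help my learning.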

Answer 1

You can use groupby, transform and sum. Finally, you can merge the result back into the original DataFrame:

print(age_gender_bkts)
  age_bucket country_destination gender  population_in_thousands  year
0       100+                  AU   male                        1  2015
1      95-99                  AU   male                        9  2015
2      90-94                  CA   male                       47  2015
3      85-89                  CA   male                      118  2015
4      80-84                  AU   male                      199  2015
5      75-79                  NL   male                      298  2015
6      70-74                  NL   male                      415  2015
7      65-69                  AU   male                      574  2015
8      60-64                  AU   male                      636  2015
9      55-59                  AU   male                      714  2015

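# Sum the population per (country, gender); unstack() yields a Series
# indexed by (country_destination, gender).
grouped = age_gender_bkts.pivot_table('population_in_thousands', index='gender', columns='country_destination', aggfunc='sum').unstack()
# Divide each total by its country total (index level 0) to get the proportion.
df = (grouped / grouped.groupby(level=0).transform('sum')).reset_index().rename(columns={0: 'prop'})
print(df)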
  country_destination gender  prop
0                  AU   male     1
1                  CA   male     1
2                  NL   male     1
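
Every prop value above is 1 because this sample contains only male rows, so each gender total equals its country total. As a minimal sketch, assuming a small made-up DataFrame called sample with both genders (the numbers are invented for illustration and are not from the Airbnb data), the same expression yields each gender's share within a country:

```python
import pandas as pd

# Hypothetical sample with both genders; the values are invented.
sample = pd.DataFrame({
    'country_destination': ['AU', 'AU', 'AU', 'CA', 'CA'],
    'gender': ['male', 'female', 'female', 'male', 'female'],
    'population_in_thousands': [30, 50, 20, 40, 60],
})

# Same pivot as in the answer: totals per (country, gender) as a Series.
grouped = sample.pivot_table(values='population_in_thousands', index='gender',
                             columns='country_destination', aggfunc='sum').unstack()

# Divide each (country, gender) total by the total of its country (level 0).
prop = grouped / grouped.groupby(level=0).transform('sum')
df = prop.reset_index().rename(columns={0: 'prop'})
print(df)
```

With both genders present, the prop values within each country add up to 1 instead of each being 1.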

print(pd.merge(age_gender_bkts, df, on=['country_destination', 'gender']))
  age_bucket country_destination gender  population_in_thousands  year  prop
0       100+                  AU   male                        1  2015     1
1      95-99                  AU   male                        9  2015     1
2      80-84                  AU   male                      199  2015     1
3      65-69                  AU   male                      574  2015     1
4      60-64                  AU   male                      636  2015     1
5      55-59                  AU   male                      714  2015     1
6      90-94                  CA   male                       47  2015     1
7      85-89                  CA   male                      118  2015     1
8      75-79                  NL   male                      298  2015     1
9      70-74                  NL   male                      415  2015     1
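
The question also asks how to build these groupings without pivot_table. One possible sketch uses groupby alone; the DataFrame sample, the column name pct and the variable names are assumptions made for illustration, and the numbers are invented:

```python
import pandas as pd

# Hypothetical sample with the same columns as age_gender_bkts; values are invented.
sample = pd.DataFrame({
    'age_bucket': ['60-64', '55-59', '60-64', '60-64', '60-64'],
    'country_destination': ['AU', 'AU', 'AU', 'CA', 'CA'],
    'gender': ['male', 'female', 'female', 'male', 'female'],
    'population_in_thousands': [30, 50, 20, 40, 60],
    'year': [2015] * 5,
})

# Total population per (country, gender), without pivot_table.
totals = sample.groupby(['country_destination', 'gender'])['population_in_thousands'].sum()

# Share of each gender within its country, expressed as a percentage.
share = totals / totals.groupby(level='country_destination').transform('sum')
pct = (share * 100).rename('pct').reset_index()
print(pct)

# Optional wide layout (one row per country, one column per gender),
# similar in shape to a pivot table.
wide = share.unstack('gender')
print(wide)
```

The long-format pct frame can be merged back into the original rows on country_destination and gender, in the same way as df above.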

Solution 1
0 Accepted 2016-02-05 13:20:01
