Pandas proportion

Question

I have a dataset that looks like this:

Location  Type  Number
House     A     4
          B     1
Garden    A     3
          B     2

I am trying to add a column that holds the proportion of type B within each location.

Expected output:

Location  Type  Number  Proportion_B
House     A     4       20%
          B     1       20%
Garden    A     3       40%
          B     2       40%

How can I achieve this?

Answer 1

Use:

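#create MultiIndex
df1 = df.set_index(['Location','Type'])
#if necessary aggregate sum per both levels
#df1 = df1.groupby(level=[0,1], sort=False).sum()

#select B level and divide by the sum per Location
#(groupby(level=...).sum() replaces sum(level=...), which was removed in pandas 2.0)
df2 = df1.xs('B', level=1).div(df1.groupby(level=0, sort=False).sum()).mul(100).add_prefix('prop_B_')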
print (df2)
          prop_B_Number
Location               
House              20.0
Garden             40.0

#join to original DataFrame
df = df.join(df2, on='Location')
print (df)
  Location Type  Number  prop_B_Number
0    House    A       4           20.0
1    House    B       1           20.0
2   Garden    A       3           40.0
3   Garden    B       2           40.0
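
As a cross-check on the numbers above (1 / (4 + 1) = 20% for House, 2 / (3 + 2) = 40% for Garden), here is a minimal sketch of a different technique, `groupby(...).transform`, which writes the per-Location result straight onto every row without a join. It is not part of the original answer; the sample `df` is rebuilt from the question, and the helper names `b_number`, `b_total` and `loc_total` are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (blank Location cells filled in)
df = pd.DataFrame({
    "Location": ["House", "House", "Garden", "Garden"],
    "Type": ["A", "B", "A", "B"],
    "Number": [4, 1, 3, 2],
})

# Keep Number only on the B rows, 0 elsewhere (hypothetical helper name)
b_number = df["Number"].where(df["Type"].eq("B"), 0)

# Broadcast the per-Location sums back to every row
b_total = b_number.groupby(df["Location"]).transform("sum")
loc_total = df.groupby("Location")["Number"].transform("sum")

# Share of B within each Location, as a percentage
df["prop_B_Number"] = b_total.div(loc_total).mul(100)
print(df)
```

Because `transform` returns a result aligned to the original rows, the original row order is preserved and no index bookkeeping is needed.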

Answer 2

I tried it this way:

temp= df.groupby('Location').apply(lambda x: ((x[x['Type']=='B']['Number']/x['Number'].sum())*100)).reset_index().rename(columns={'Number':'Proportion_B'})
temp=temp[['Location','Proportion_B']]
temp['Proportion_B']=temp['Proportion_B'].astype(str).str.replace(r'\.0$','',regex=True)+'%'
df=pd.merge(df,temp,how='left',on=['Location'])

Output:

  Location Type  Number Proportion_B
0    House    A       4          20%
1    House    B       1          20%
2   Garden    A       3          40%
3   Garden    B       2          40%

Explanation:

Group the rows by Location, then divide the Number of the B rows by the total Number for that Location and save the result.
Merge the temp result with the original df on Location using a left merge.

Note: lines 2 and 3 of the code are only needed to reproduce the sample output format (they keep the relevant columns and format the value as a percentage string).
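
The same three steps (group, divide, merge) can also be sketched with plain aggregations instead of `apply`. This is an illustrative variant, not the code from the answer above; the sample `df` is rebuilt from the question, and the names `total`, `b_sum`, `share` and `result` are assumptions made for the example. Locations without any B row are treated as 0%.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Location": ["House", "House", "Garden", "Garden"],
    "Type": ["A", "B", "A", "B"],
    "Number": [4, 1, 3, 2],
})

# Step 1: total Number per Location, and Number of the B rows per Location
total = df.groupby("Location")["Number"].sum()
b_sum = (
    df[df["Type"] == "B"]
    .groupby("Location")["Number"]
    .sum()
    .reindex(total.index, fill_value=0)  # Locations with no B row count as 0
)

# Step 2: B share as a percentage, formatted as a string such as "20%"
share = b_sum.div(total).mul(100)
temp = (share.round().astype(int).astype(str) + "%").rename("Proportion_B").reset_index()

# Step 3: left merge back onto the original rows
result = df.merge(temp, how="left", on="Location")
print(result)
```

Rounding to whole percentages matches the sample output; keep `share` as a float column instead if the values are needed for further arithmetic.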

Answer 3

Maybe this one:

df_temp = df.groupby('Location').apply(lambda x: ((x[x['Type']=='B']['Number']/x['Number'].sum())*100)).reset_index().rename(columns={'Number':'Proportion_B'})
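#keep only the needed columns so the helper column level_1 created by reset_index is not merged in
df=pd.merge(df,df_temp[['Location','Proportion_B']],how='left',on=['Location'])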

3 answers:

Answer 1: 0 2018-10-09 11:42:29
Answer 2: 0 2018-10-09 11:55:58
Answer 3: 0 2018-10-09 12:46:13
