Pandas DataFrame: calculate the percentage difference between rows?

Question

I have a year-wise DataFrame in which each row has three fields: year, type and value. I'm trying to calculate the percentage of taken vs. empty for each year. For example, 2014 has 50 empty and 50 taken out of a total of 100, so 50% empty and 50% taken, as shown in final_df.

df

    year     type          value
            
0     2014  Total          100
1     2014  Empty           50
2     2014  Taken           50
3     2013  Total          2000
4     2013  Empty          100
5     2013  Taken          1900
6     2012  Total          50
7     2012  Empty          45
8     2012  Taken           5

Final df

    year    Empty          Taken
            
0   2014    50             50 
0   2013    ...            ...    
0   2012    ...            ...

Should I shift cells up and then calculate the percentage, or is there another method?

Answer 1

You can use pivot_table:

# Drop the 'Total' rows, then pivot so that each type becomes a column
new = df[df['type'] != 'Total']
res = (new.pivot_table(index='year', columns='type', values='value')
          .sort_values(by='year', ascending=False).reset_index())

which gets you:

res
      year  Empty  Taken
0     2014     50     50
1     2013    100   1900
2     2012     45      5

And then you can get the percentages for each column:

total = (res['Empty'] + res['Taken'])
for col in ['Empty','Taken']:
    res[col+'_perc'] = res[col] / total


year  Empty  Taken  Empty_perc  Taken_perc                                     
2014     50     50        0.50        0.50
2013    100   1900        0.05        0.95
2012     45      5        0.90        0.10
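
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and replaces the loop with DataFrame.div; the names wide and shares are only illustrative and do not come from the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "year":  [2014, 2014, 2014, 2013, 2013, 2013, 2012, 2012, 2012],
    "type":  ["Total", "Empty", "Taken"] * 3,
    "value": [100, 50, 50, 2000, 100, 1900, 50, 45, 5],
})

# Drop the 'Total' rows, then spread 'type' into columns (one row per year).
wide = (
    df[df["type"] != "Total"]
    .pivot_table(index="year", columns="type", values="value")
    .sort_index(ascending=False)
)

# Divide every column by the row sum to get each type's share of the year.
shares = wide.div(wide.sum(axis=1), axis=0).add_suffix("_perc")

# Put the raw values and the shares side by side, with 'year' as a column.
res = wide.join(shares).reset_index()
print(res)
```

Dividing by the row sum means the shares do not depend on the 'Total' rows at all, which helps if those rows are missing or unreliable.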

Answer 2

As @sophods pointed out, you can use pivot_table to rearrange your DataFrame. To add to that answer: I think you're after the percentage, so I suggest you keep the 'Total' record and then apply your calculation:

#pivot your data
res = (df.pivot_table(index='year',columns='type',values='value')).reset_index()
#calculate percentages of empty and taken
res['Empty'] = res['Empty']/res['Total']
res['Taken'] = res['Taken']/res['Total']
#final dataframe
res = res[['year', 'Empty', 'Taken']]
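
A possible self-contained variant of this approach, offered only as a sketch: it uses pivot instead of pivot_table (an assumption that each year/type pair appears once), checks that Empty and Taken really add up to Total, and scales the result to percent so that it matches the final_df in the question. The name wide is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "year":  [2014, 2014, 2014, 2013, 2013, 2013, 2012, 2012, 2012],
    "type":  ["Total", "Empty", "Taken"] * 3,
    "value": [100, 50, 50, 2000, 100, 1900, 50, 45, 5],
})

# pivot (rather than pivot_table) raises if a year/type pair is duplicated,
# instead of silently averaging the duplicates.
wide = df.pivot(index="year", columns="type", values="value")

# Sanity check: the parts should add up to the reported total.
assert (wide["Empty"] + wide["Taken"]).eq(wide["Total"]).all()

# Share of each type, expressed in percent of the reported total.
final_df = (
    wide[["Empty", "Taken"]]
    .div(wide["Total"], axis=0)
    .mul(100)
    .sort_index(ascending=False)
    .reset_index()
)
print(final_df)
```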

Answer 3

You can keep only the records whose type is Empty or Taken, then group by year and apply func. In func, you can set type as the index, then look up the required values and calculate the percentages. The x passed to func is a DataFrame holding the type and value columns for a single group.

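def func(x):
    # x holds the 'type' and 'value' rows of a single year
    x = x.set_index('type')
    total = x['value'].sum()
    # percentage of Empty and of Taken within that year
    return [(x.loc['Empty', 'value']/total)*100, (x.loc['Taken', 'value']/total)*100]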

temp = (df[df['type'].isin({'Empty', 'Taken'})]
        .groupby('year')[['type', 'value']]
        .apply(lambda x: func(x)))
temp

year
2012    [90.0, 10.0]
2013    [5.0, 95.0] 
2014    [50.0, 50.0]
dtype: object

Convert the result into the required DataFrame:

pd.DataFrame(temp.values.tolist(), index=temp.index, columns=['Empty', 'Taken'])
       Empty    Taken
year        
2012    90.0    10.0
2013    5.0     95.0
2014    50.0    50.0
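
As a hedged alternative to the apply-based version, the same per-year percentages can probably be obtained without a Python-level function by broadcasting each year's sum with groupby(...).transform('sum'). This is a different technique from the one in the answer; the names parts and pct are made up for the sketch.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "year":  [2014, 2014, 2014, 2013, 2013, 2013, 2012, 2012, 2012],
    "type":  ["Total", "Empty", "Taken"] * 3,
    "value": [100, 50, 50, 2000, 100, 1900, 50, 45, 5],
})

# Keep only the Empty and Taken rows; copy so we can add a column safely.
parts = df[df["type"].isin(["Empty", "Taken"])].copy()

# transform('sum') returns each year's total aligned to the original rows,
# so the division is done row by row without a custom function.
year_total = parts.groupby("year")["value"].transform("sum")
parts["pct"] = parts["value"] / year_total * 100

# One row per year, one column per type.
result = parts.pivot(index="year", columns="type", values="pct")
print(result)
```

The result is indexed by year in ascending order, like the output above; add .sort_index(ascending=False) if the newest year should come first.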

3 answers

Answer 1
3 Accepted 2021-02-02 17:27:41

Answer 2
1 2021-02-02 17:31:53

Answer 3
0 2021-02-02 17:38:15
