Groupby and aggregate operations on a Pandas dataframe

Question

I have a Pandas dataframe:

     Date     Type     Section      Status   
--------------------------------------------
0     1-Apr    Type1       A         Present
1     1-Apr    Type2       A         Absent
2     1-Apr    Type2       A         Present
3     1-Apr    Type1       B         Absent
4     2-Apr    Type1       A         Present
5     2-Apr    Type2       C         Present
6     2-Apr    Type2       C         Present

I'd like to group the DF into a slightly different format:

     Date     Type     A_Pre  A_Abs   B_Pre   B_Abs    C_Pre   C_Abs   
------------------------------------------------------------------------------
0     1-Apr    Type1       1    0       0       1        0        0 
1              Type2       1    1       0       0        0        0
2     2-Apr    Type1       1    0       0       0        0        0         
3              Type2       0    0       0       0        1        1

I want to get an aggregated report from the original table, where the entries are grouped by Date and Type and then split into one count column per Section and Status combination. After 2 days of trying, I have no idea how to approach this.

Any help would be greatly appreciated.

Answer 1

First, I would create the columns you want to aggregate, filled with zeros and ones, and then use groupby and take a simple sum of the values.

I didn't get to try this out, but I think the following should work:

# Indicator columns to build: one per (section, status) pair
Present = ['A_Pre', 'B_Pre', 'C_Pre']
Absent = ['A_Abs', 'B_Abs', 'C_Abs']

# The first character of each column name is the section letter
for string in Present:
    DF[string] = ((DF['Status'] == 'Present') & (DF['Section'] == string[0])).astype(int)
for string in Absent:
    DF[string] = ((DF['Status'] == 'Absent') & (DF['Section'] == string[0])).astype(int)

# Sum only the 0/1 indicator columns within each (Date, Type) group;
# note the column is named 'Type', not 'type'
DF.groupby(['Date', 'Type'])[Present + Absent].sum()
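
To check the idea end to end, here is a minimal self-contained sketch that rebuilds the sample table from the question and applies the same indicator-column approach. The names `suffix`, `sections`, `columns` and `report` are my own, introduced only for illustration, and the column order is chosen to match the desired output.

```python
import pandas as pd

# Sample data copied from the question
DF = pd.DataFrame({
    'Date':    ['1-Apr', '1-Apr', '1-Apr', '1-Apr', '2-Apr', '2-Apr', '2-Apr'],
    'Type':    ['Type1', 'Type2', 'Type2', 'Type1', 'Type1', 'Type2', 'Type2'],
    'Section': ['A', 'A', 'A', 'B', 'A', 'C', 'C'],
    'Status':  ['Present', 'Absent', 'Present', 'Absent',
                'Present', 'Present', 'Present'],
})

# Hypothetical helpers: status -> column suffix, and the sections to report on
suffix = {'Present': 'Pre', 'Absent': 'Abs'}
sections = ['A', 'B', 'C']
columns = [f'{sect}_{suf}' for sect in sections for suf in suffix.values()]

# One 0/1 indicator column per (section, status) pair
for sect in sections:
    for status, suf in suffix.items():
        DF[f'{sect}_{suf}'] = (
            (DF['Section'] == sect) & (DF['Status'] == status)
        ).astype(int)

# Sum the indicators within each (Date, Type) group
report = DF.groupby(['Date', 'Type'])[columns].sum()
print(report)
```

On this sample the 2-Apr / Type2 group should come out as C_Pre = 2 and C_Abs = 0, because both of its rows are Present in section C. That differs from the 1 and 1 shown in the question's desired table, which does not seem to match the input rows.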

Answer 1: accepted, 2014-05-11 02:11:49
