Groupby and aggregate operation on Pandas Dataframe
I have a Pandas dataframe:
Date Type Section Status
--------------------------------------------
0 1-Apr Type1 A Present
1 1-Apr Type2 A Absent
2 1-Apr Type2 A Present
3 1-Apr Type1 B Absent
4 2-Apr Type1 A Present
5 2-Apr Type2 C Present
6 2-Apr Type2 C Present
I'd like to group the DF into a slightly different format:
Date Type A_Pre A_Abs B_Pre B_Abs C_Pre C_Abs
------------------------------------------------------------------------------
0 1-Apr Type1 1 0 0 1 0 0
1 Type2 1 1 0 0 0 0
2 2-Apr Type1 1 0 0 0 0 0
3 Type2 0 0 0 0 1 1
I want to get an aggregated report from the original table where the entries are grouped by Date and Type and then split into Present/Absent counts per Section. After 2 days of trying, I have no idea how to approach this.
Any help would be greatly appreciated.
First, I would create the columns you wish to aggregate, populated with zeros and ones, and then use groupby and do a simple sum of the values.
I didn't get to try this out, but I think the following should work:
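import pandas as pd

# DF is the dataframe from the question.
Present = ['A_Pre', 'B_Pre', 'C_Pre']
Absent = ['A_Abs', 'B_Abs', 'C_Abs']
# Add one 0/1 indicator column per Section/Status combination;
# string[0] is the section letter at the start of the column name.
for string in Present:
    DF[string] = pd.Series([1 if stat == 'Present' and sect == string[0] else 0
                            for stat, sect in zip(DF['Status'], DF['Section'])],
                           index=DF.index)
for string in Absent:
    DF[string] = pd.Series([1 if stat == 'Absent' and sect == string[0] else 0
                            for stat, sect in zip(DF['Status'], DF['Section'])],
                           index=DF.index)
# Sum only the indicator columns within each group.
# The column is named 'Type' (capital T), so 'type' would raise a KeyError.
DF.groupby(['Date', 'Type'])[Present + Absent].sum()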
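As an alternative to building the indicator columns by hand, the same counts can probably be obtained with `pd.crosstab`. The following is a minimal sketch, not the answerer's method: it rebuilds the sample data from the question, and the names `label`, `wanted`, and `report` are illustrative choices. Note that it counts rows, so for 2-Apr / Type2 it gives `C_Pre = 2` and `C_Abs = 0`, because both of those rows are `Present` in the sample data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date":    ["1-Apr", "1-Apr", "1-Apr", "1-Apr", "2-Apr", "2-Apr", "2-Apr"],
    "Type":    ["Type1", "Type2", "Type2", "Type1", "Type1", "Type2", "Type2"],
    "Section": ["A", "A", "A", "B", "A", "C", "C"],
    "Status":  ["Present", "Absent", "Present", "Absent",
                "Present", "Present", "Present"],
})

# Build the target column label for each row, e.g. "A_Pre" or "B_Abs".
label = df["Section"] + "_" + df["Status"].str[:3]

# Count rows for every (Date, Type) group and label.
report = pd.crosstab([df["Date"], df["Type"]], label)

# Combinations that never occur (such as C_Abs) are missing from the
# crosstab, so add them as all-zero columns in the order the question uses.
wanted = [f"{sect}_{stat}" for sect in ["A", "B", "C"] for stat in ["Pre", "Abs"]]
report = report.reindex(columns=wanted, fill_value=0).reset_index()
report.columns.name = None  # drop the leftover column-axis name

print(report)
```

Because `crosstab` only produces columns for labels that actually appear, the `reindex` step is what guarantees a fixed set of columns even when a Section/Status pair is absent from the data.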