How do I group by two columns and then count the occurrences of each unique value in a third column for each grouping?
I have a unique identifier column ["EMID"] that I want to group by, along with a date column ["DateNew"]. Then I would like to count the number of times each value in ["BRalpha"] occurs within each group.
Data set:
EMID | DateNew | BRalpha |
---|---|---|
SIM10001 | 2016-06-01 | LUMB |
SIM10001 | 2016-06-01 | LUMB |
SIM10001 | 2016-07-01 | LUMB |
SIM10001 | 2016-07-01 | THOR |
SIM10002 | 2016-02-01 | NSPC |
SIM10002 | 2016-02-01 | NSPC |
SIM10002 | 2016-02-01 | NSPC |
SIM10002 | 2016-02-01 | NSPC |
SIM10002 | 2016-02-01 | NSPC |
SIM10003 | 2017-03-01 | ANFT |
SIM10003 | 2017-03-01 | ANFT |
Desired output:
EMID | DateNew | Count_LUMB | Count_THOR | Count_NSPC | Count_ANFT |
---|---|---|---|---|---|
SIM10001 | 2016-06-01 | 2 | 0 | 0 | 0 |
SIM10001 | 2016-07-01 | 1 | 1 | 0 | 0 |
SIM10002 | 2016-02-01 | 0 | 0 | 5 | 0 |
SIM10003 | 2017-03-01 | 0 | 0 | 0 | 2 |
print(
    df.groupby(["EMID", "DateNew", "BRalpha"])
    .size()
    .unstack()
    .fillna(0)
    .astype(int)
    .add_prefix("count_")
    .reset_index()
)
Prints:
BRalpha      EMID     DateNew  count_ANFT  count_LUMB  count_NSPC  count_THOR
0        SIM10001  2016-06-01           0           2           0           0
1        SIM10001  2016-07-01           0           1           0           1
2        SIM10002  2016-02-01           0           0           5           0
3        SIM10003  2017-03-01           2           0           0           0
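
For comparison, here is a minimal self-contained sketch of another way to get the same counts, using `pd.crosstab` instead of `groupby`. The sample frame is retyped from the question's data set, the variable name `counts` is only illustrative, and it assumes `DateNew` is stored as plain strings (a real datetime column should behave the same way, but that is not shown in the question).

```python
import pandas as pd

# Sample data retyped from the question (assumption: DateNew holds strings).
df = pd.DataFrame(
    {
        "EMID": ["SIM10001"] * 4 + ["SIM10002"] * 5 + ["SIM10003"] * 2,
        "DateNew": (
            ["2016-06-01"] * 2
            + ["2016-07-01"] * 2
            + ["2016-02-01"] * 5
            + ["2017-03-01"] * 2
        ),
        "BRalpha": ["LUMB", "LUMB", "LUMB", "THOR"] + ["NSPC"] * 5 + ["ANFT"] * 2,
    }
)

# Cross-tabulate each (EMID, DateNew) pair against the BRalpha values.
# Combinations that never occur are filled with 0 automatically.
counts = (
    pd.crosstab([df["EMID"], df["DateNew"]], df["BRalpha"])
    .add_prefix("Count_")        # Count_ANFT, Count_LUMB, ...
    .reset_index()               # turn EMID and DateNew back into columns
    .rename_axis(columns=None)   # drop the leftover "BRalpha" columns label
)
print(counts)
```

The count columns come out in alphabetical order. If the exact column order from the desired output matters, selecting the columns explicitly afterwards (for example `counts[["EMID", "DateNew", "Count_LUMB", "Count_THOR", "Count_NSPC", "Count_ANFT"]]`) should rearrange them.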