How do I group by two columns and then count the occurrences of each unique value in a third column for each of the groupings?

I want to group by a unique identifier column ("EMID") together with a date column ("DateNew"), and then count how many times each value in "BRalpha" occurs within each group.

Data Set:

EMID DateNew BRalpha
SIM10001 2016-06-01 LUMB
SIM10001 2016-06-01 LUMB
SIM10001 2016-07-01 LUMB
SIM10001 2016-07-01 THOR
SIM10002 2016-02-01 NSPC
SIM10002 2016-02-01 NSPC
SIM10002 2016-02-01 NSPC
SIM10002 2016-02-01 NSPC
SIM10002 2016-02-01 NSPC
SIM10003 2017-03-01 ANFT
SIM10003 2017-03-01 ANFT

Desired output:

EMID DateNew Count_LUMB Count_THOR Count_NSPC Count_ANFT
SIM10001 2016-06-01 2 0 0 0
SIM10001 2016-07-01 1 1 0 0
SIM10002 2016-02-01 0 0 5 0
SIM10003 2017-03-01 0 0 0 2
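
One way to get this is to group by all three columns, count the rows in each group with size(), and then unstack BRalpha into columns:

# Assumes df is a pandas DataFrame holding the data set above.
print(
    df.groupby(["EMID", "DateNew", "BRalpha"])
    .size()  # number of rows per (EMID, DateNew, BRalpha)
    .unstack()  # pivot BRalpha values into columns
    .fillna(0)  # combinations that never occur become 0
    .astype(int)  # unstack introduces NaN, so cast the counts back to int
    .add_prefix("count_")
    .reset_index()
)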

Prints:

BRalpha      EMID     DateNew  count_ANFT  count_LUMB  count_NSPC  count_THOR
0        SIM10001  2016-06-01           0           2           0           0
1        SIM10001  2016-07-01           0           1           0           1
2        SIM10002  2016-02-01           0           0           5           0
3        SIM10003  2017-03-01           2           0           0           0
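
For comparison, here is a self-contained sketch of the same count built with pd.crosstab instead of groupby/unstack. The DataFrame is rebuilt by hand from the data set in the question, and the names df and counts are only illustrative. The prefix is set to "Count_" to match the capitalization in the desired output; note that the count columns still come out in alphabetical order rather than in the order shown in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "EMID": ["SIM10001"] * 4 + ["SIM10002"] * 5 + ["SIM10003"] * 2,
        "DateNew": (
            ["2016-06-01"] * 2
            + ["2016-07-01"] * 2
            + ["2016-02-01"] * 5
            + ["2017-03-01"] * 2
        ),
        "BRalpha": ["LUMB", "LUMB", "LUMB", "THOR"] + ["NSPC"] * 5 + ["ANFT"] * 2,
    }
)

# crosstab counts co-occurrences directly: rows are (EMID, DateNew) pairs,
# columns are the distinct BRalpha values, and absent combinations are 0.
counts = (
    pd.crosstab([df["EMID"], df["DateNew"]], df["BRalpha"])
    .add_prefix("Count_")
    .rename_axis(columns=None)  # drop the "BRalpha" label on the column axis
    .reset_index()
)

print(counts)
```

Because crosstab fills missing combinations with 0 itself, the fillna(0) and astype(int) steps are not needed here. If the groupby version is preferred, unstack(fill_value=0) should have the same effect.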
