How do I group by two columns and then count the occurrences of each unique value in a third column for each of the groupings?

Question

I want to group by a unique identifier column ["EMID"] together with a date column ["DateNew"], and then count how many times each value in ["BRalpha"] occurs within each group.

Data Set:

EMID	DateNew	BRalpha
SIM10001	2016-06-01	LUMB
SIM10001	2016-06-01	LUMB
SIM10001	2016-07-01	LUMB
SIM10001	2016-07-01	THOR
SIM10002	2016-02-01	NSPC
SIM10002	2016-02-01	NSPC
SIM10002	2016-02-01	NSPC
SIM10002	2016-02-01	NSPC
SIM10002	2016-02-01	NSPC
SIM10003	2017-03-01	ANFT
SIM10003	2017-03-01	ANFT

Desired output:

EMID	DateNew	Count_LUMB	Count_THOR	Count_NSPC	Count_ANFT
SIM10001	2016-06-01	2	0	0	0
SIM10001	2016-07-01	1	1	0	0
SIM10002	2016-02-01	0	0	5	0
SIM10003	2017-03-01	0	0	0	2

Answer 1

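print(
    # df is the DataFrame from the question.
    # Count the rows for every (EMID, DateNew, BRalpha) combination.
    df.groupby(["EMID", "DateNew", "BRalpha"])
    .size()
    # Move the BRalpha level into columns, one per unique value.
    .unstack()
    # Combinations that never occur become NaN; turn them into integer zeros.
    .fillna(0)
    .astype(int)
    .add_prefix("count_")
    .reset_index()
)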

Prints:

BRalpha      EMID     DateNew  count_ANFT  count_LUMB  count_NSPC  count_THOR
0        SIM10001  2016-06-01           0           2           0           0
1        SIM10001  2016-07-01           0           1           0           1
2        SIM10002  2016-02-01           0           0           5           0
3        SIM10003  2017-03-01           2           0           0           0
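
If you also want the column names and column order from the desired output, one possible variation is pd.crosstab followed by a reindex. This is only a sketch: the DataFrame is rebuilt by hand from the sample rows in the question, DateNew is assumed to be stored as plain strings, and the names `order` and `result` are mine, not from the question.

```python
import pandas as pd

# Sample rows copied from the question (DateNew kept as strings: an assumption).
df = pd.DataFrame(
    {
        "EMID": ["SIM10001"] * 4 + ["SIM10002"] * 5 + ["SIM10003"] * 2,
        "DateNew": ["2016-06-01"] * 2
        + ["2016-07-01"] * 2
        + ["2016-02-01"] * 5
        + ["2017-03-01"] * 2,
        "BRalpha": ["LUMB", "LUMB", "LUMB", "THOR"] + ["NSPC"] * 5 + ["ANFT"] * 2,
    }
)

# BRalpha values in order of first appearance (LUMB, THOR, NSPC, ANFT),
# which is the column order used in the desired output.
order = df["BRalpha"].unique()

result = (
    # Cross-tabulate (EMID, DateNew) against BRalpha; missing pairs are 0.
    pd.crosstab([df["EMID"], df["DateNew"]], df["BRalpha"])
    # Reorder the count columns; crosstab sorts them alphabetically.
    .reindex(columns=order, fill_value=0)
    .add_prefix("Count_")
    .reset_index()
    # Drop the leftover "BRalpha" label on the column axis.
    .rename_axis(None, axis="columns")
)

print(result)
```

The groupby/size/unstack chain above produces the same counts; crosstab just fills the missing combinations with 0 for you, so the fillna and astype steps are not needed.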
