I have a unique identifier column, "EMID", that I want to group by together with a date column, "DateNew". For each (EMID, DateNew) group, I would like to count how many times each value of "BRalpha" occurs.
Data Set:
EMID | DateNew | BRalpha |
---|---|---|
SIM10001 | 2016-06-01 | LUMB |
SIM10001 | 2016-06-01 | LUMB |
SIM10001 | 2016-07-01 | LUMB |
SIM10001 | 2016-07-01 | THOR |
SIM10002 | 2016-02-01 | NSPC |
SIM10002 | 2016-02-01 | NSPC |
SIM10002 | 2016-02-01 | NSPC |
SIM10002 | 2016-02-01 | NSPC |
SIM10002 | 2016-02-01 | NSPC |
SIM10003 | 2017-03-01 | ANFT |
SIM10003 | 2017-03-01 | ANFT |
Desired output:
EMID | DateNew | Count_LUMB | Count_THOR | Count_NSPC | Count_ANFT |
---|---|---|---|---|---|
SIM10001 | 2016-06-01 | 2 | 0 | 0 | 0 |
SIM10001 | 2016-07-01 | 1 | 1 | 0 | 0 |
SIM10002 | 2016-02-01 | 0 | 0 | 5 | 0 |
SIM10003 | 2017-03-01 | 0 | 0 | 0 | 2 |
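You can group by all three columns, count the rows in each group with `size()`, and then unstack `BRalpha` into columns:

    print(
        df.groupby(["EMID", "DateNew", "BRalpha"])
        .size()                  # rows per (EMID, DateNew, BRalpha) combination
        .unstack()               # pivot the BRalpha values into columns
        .fillna(0)               # combinations that never occur become 0
        .astype(int)             # the NaN fill leaves floats; restore integer counts
        .add_prefix("count_")
        .reset_index()
    )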
Prints:
    BRalpha      EMID     DateNew  count_ANFT  count_LUMB  count_NSPC  count_THOR
    0        SIM10001  2016-06-01           0           2           0           0
    1        SIM10001  2016-07-01           0           1           0           1
    2        SIM10002  2016-02-01           0           0           5           0
    3        SIM10003  2017-03-01           2           0           0           0
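The output above sorts the count columns alphabetically, uses a lowercase `count_` prefix, and keeps `BRalpha` as the name of the column axis. If the exact layout from the question is wanted, one possible variation is sketched below with `pd.crosstab`. It is an alternative to the `groupby` approach, not a correction of it. The sample frame is rebuilt from the question's table, and it assumes `DateNew` is stored as strings and that the desired column order is the order in which the `BRalpha` values first appear.

```python
import pandas as pd

# Sample data rebuilt from the question's table (DateNew kept as strings: an assumption).
df = pd.DataFrame({
    "EMID": ["SIM10001"] * 4 + ["SIM10002"] * 5 + ["SIM10003"] * 2,
    "DateNew": ["2016-06-01"] * 2 + ["2016-07-01"] * 2
               + ["2016-02-01"] * 5 + ["2017-03-01"] * 2,
    "BRalpha": ["LUMB", "LUMB", "LUMB", "THOR"] + ["NSPC"] * 5 + ["ANFT"] * 2,
})

# Assumed column order: order of first appearance (LUMB, THOR, NSPC, ANFT).
order = df["BRalpha"].unique()

result = (
    pd.crosstab([df["EMID"], df["DateNew"]], df["BRalpha"])  # integer counts, 0 where absent
    .reindex(columns=order, fill_value=0)                    # reorder the count columns
    .add_prefix("Count_")                                    # capitalised prefix, as in the question
    .rename_axis(columns=None)                               # drop the "BRalpha" axis label
    .reset_index()                                           # EMID and DateNew back to columns
)
print(result)
```

Because `crosstab` fills missing combinations with 0 itself, the `fillna(0).astype(int)` step is not needed here.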