简体   繁体   中英

GroupBy Dataframe by one column and get count based on another column

I understand the title might not make much sense. I want a separate count of Action values Yes and No Per Month, given the data below.

Here is my data

    Date      Action
234 2021-03-05   yes
235 2021-03-05   yes
236 2021-03-15   yes
237 2021-03-02    no
238 2021-03-05   yes
..         ...    ...
460 2020-01-10    no
461 2019-12-27    no
462 2019-12-19    no
463 2019-12-18    no
464 2019-12-17    no

Current Code

var = df.groupby(df.dt.strftime("%y-%m")).size().reset_index(name='counts')
var = var .to_dict(orient='records')

Current Output

[{date: "2021-03", count: "10"},{},...]

Desired Output

[{date: "2021-03", "yes": 2, "no": 8},{},...]

Use pandas.DataFrame.groupby.value_counts with unstack :

new_df = df.groupby(df["Date"].dt.strftime("%y-%m"))["Action"].value_counts().unstack()
print(new_df)

Output:

Action   no  yes
Date            
19-12   4.0  NaN
20-01   1.0  NaN
21-03   1.0  4.0

Then you can make them a list of dicts by to_dict with orient=="records" :

new_df.reset_index().to_dict("records")

Output:

[{'Date': '19-12', 'no': 4.0, 'yes': nan},
 {'Date': '20-01', 'no': 1.0, 'yes': nan},
 {'Date': '21-03', 'no': 1.0, 'yes': 4.0}]

Let us do crosstab

output = pd.crosstab(df["Date"].dt.strftime("%y-%m"),df.Action).reset_index().to_dict("records")

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM