简体   繁体   中英

How can I groupby a DataFrame at the same time I count the values and put in different columns?

I have a DataFrame that looks like the one below

Index  Category  Class
 0        1        A
 1        1        A
 2        1        B
 3        2        A
 4        3        B
 5        3        B

And I would like to get an output data frame that groups by category and have one column for each of the classes with the counting of the occurrences of that class in each category, such as the one below

Index Category   A   B
 0      1        2   1
 1      2        1   0
 2      3        0   2

So far I've tried various combinations of the groupby and agg methods, but I still can't get what I want. I've also tried df.pivot_table(index='Category', columns='Class', aggfunc='count') , but that return a DataFrame without columns. Any ideas of what could work in this case?

Assign a dummy value to count:

out = df.assign(val=1).pivot_table('val', 'Category', 'Class',
                                   aggfunc='count', fill_value=0).reset_index()
print(out)

# Output
Class  Category  A  B
0             1  2  1
1             2  1  0
2             3  0  2
    import pandas as pd
df = pd.DataFrame({'Index':[0,1,2,3,4,5],
                   'Category': [1,1,1,2,3,3],
                   'Class':['A','A','B','A','B','B'],
              })
df = df.groupby(['Category', 'Class']).count()
df = df.pivot_table(index='Category', columns='Class')
print(df)

output:

             Index     
Class        A    B
Category           
1          2.0  1.0
2          1.0  NaN
3          NaN  2.0

You can use aggfunc="size" to achieve your desired result:

>>> df.pivot_table(index='Category', columns='Class', aggfunc='size', fill_value=0)

Class     A  B
Category
1         2  1
2         1  0
3         0  2

Alternatively, you can use .groupby(...).size() to get the counts, and then unstack to reshape your data as well:

>>> df.groupby(["Category", "Class"]).size().unstack(fill_value=0)

Class     A  B
Category
1         2  1
2         1  0
3         0  2

Use crosstab :

pd.crosstab(df['Category'], df['Class']).reset_index()

output:

Class  Category  A  B
0             1  2  1
1             2  1  0
2             3  0  2

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM