简体   繁体   中英

For each unique Pandas series value, count an other field

I have a Data Frame like that :

df = pd.DataFrame({"id": [1, 2, 2, 1, 3, 3, 2, 1, 2], 
                   "colors": ['green', 'blue', 'green', 'yellow', 'yellow', 'yellow', 'green', 'red', 'black']

So there are two series with same length but no logical patern between id and colors. Eg id 1 can be blue green, red etc.

That I want to do is to create an other Data Frame with a first row of each unique id and next rows with the count of all possible colors.

In my example, I want the result to be :

id green blue yellow red black
1 1 0 1 1 0
2 2 1 0 0 1
3 0 0 2 0 0

I try : but the == doesn't work.

print(df.loc[df.id.unique() == df["id"], "colors"].value_counts())

You could use pivot_table :

df.pivot_table(index="id", columns="colors", aggfunc=len, fill_value=0)

# out:
colors  black  blue  green  red  yellow
id                                     
1           0     0      1    1       1
2           1     1      2    0       0
3           0     0      0    0       2

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM