
Python: Assign a unique ID to each combination of values across two columns and multiple rows in a pandas dataframe

I have a dataframe like this

client colors
Maria Black
Maria Blue
Nadia Blue
Nadia Black
Sophie Blue
Maud Red

Each client has one or more colors, and I want to create:

  • an ID_combinaision for each combination of colors that exists. For example, the table above contains three combinations (Black/Blue, Blue, Red), and I want to assign a unique ID to each one.
  • for each ID_combinaision, the number_clients that have it (for example, 2 clients have the Black/Blue combination and 1 client has Blue alone).

The result would look like this:

ID_combinaision colors number_clients
1 Black 2
1 Blue 2
2 Blue 1
3 Red 1

How can I do this?

Thanks!

try:

df
    client  colors
0   Maria   Black
1   Maria   Blue
2   Nadia   Blue
3   Nadia   Black
4   Sophie  Blue
5   Maud    Red

# 1. Combination of colors that exists for each client

df1 = df.groupby('client')['colors'].apply(list).reset_index()
df1['nbr_of_colors_per_client'] = df1['colors'].map(len)
df1
    client  colors          nbr_of_colors_per_client
0   Maria   [Black, Blue]   2
1   Maud    [Red]           1
2   Nadia   [Blue, Black]   2
3   Sophie  [Blue]          1

# 2. Combination of clients that exists for each color

df2 = df.groupby('colors')['client'].apply(list).reset_index()
df2['nbr_of_clients_per_color'] = df2['client'].map(len)

df2
    colors  client                  nbr_of_clients_per_color
0   Black   [Maria, Nadia]          2
1   Blue    [Maria, Nadia, Sophie]  3
2   Red     [Maud]                  1
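
The two groupings above stop short of the table requested in the question, and the per-client lists are order-sensitive ([Black, Blue] and [Blue, Black] would count as different combinations). The following is a minimal sketch of one way to finish the job, not necessarily what the answerer had in mind. It assumes that no color name contains "/", that IDs can simply be numbered in alphabetical order of the combination, and the names key, summary, and combination are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "client": ["Maria", "Maria", "Nadia", "Nadia", "Sophie", "Maud"],
    "colors": ["Black", "Blue", "Blue", "Black", "Blue", "Red"],
})

# One order-independent key per client, e.g. "Black/Blue".
# Assumption: no color name contains "/".
key = (
    df.groupby("client")["colors"]
      .apply(lambda s: "/".join(sorted(set(s))))
      .rename("combination")
      .reset_index()
)

# Count how many clients share each combination.
summary = (
    key.groupby("combination")
       .size()
       .rename("number_clients")
       .reset_index()
)

# groupby sorts its keys, so IDs follow alphabetical order of the key string.
summary["ID_combinaision"] = range(1, len(summary) + 1)

# Expand each combination back to one row per color.
result = (
    summary.assign(colors=summary["combination"].str.split("/"))
           .explode("colors")
           [["ID_combinaision", "colors", "number_clients"]]
           .reset_index(drop=True)
)
print(result)
```

On the sample data this should give the four rows shown in the question: ID 1 for Black and Blue with 2 clients, ID 2 for Blue with 1 client, and ID 3 for Red with 1 client. If color names may contain "/", a different separator or a tuple key would be needed.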


 