
Groupby and assign unique IDs to group members

I have a DataFrame:

import pandas as pd

df = pd.DataFrame({'fruit': ['apple', 'apple', 'apple', 'apple', 'orange', 'orange', 'orange', 'orange', 'orange', 'orange'],
                   'distance': [10, 0, 20, 40, 20, 50, 70, 90, 110, 130]})
df

    fruit   distance
0   apple   10
1   apple   0
2   apple   20
3   apple   40
4   orange  20
5   orange  50
6   orange  70
7   orange  90
8   orange  110
9   orange  130

I would like to add a unique ID to each group member, numbered within its group in order of distance, like this:

    fruit   distance    ID
0   apple   10  apple_2
1   apple   0   apple_1
2   apple   20  apple_3
3   apple   40  apple_4
4   orange  20  orange_1
5   orange  50  orange_2
6   orange  70  orange_3
7   orange  130 orange_6
8   orange  110 orange_5
9   orange  90  orange_4

My attempts with sort/groupby/loop have not yet been successful.

Using pandas.DataFrame.groupby with rank:

df['ID'] = df['fruit'] + "_" + df.groupby("fruit")["distance"].rank().astype(int).astype(str)
print(df)

Output:

    fruit  distance        ID
0   apple        10   apple_2
1   apple         0   apple_1
2   apple        20   apple_3
3   apple        40   apple_4
4  orange        20  orange_1
5  orange        50  orange_2
6  orange        70  orange_3
7  orange        90  orange_4
8  orange       110  orange_5
9  orange       130  orange_6
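
One caveat with rank: its default method="average" gives tied distances a fractional rank (for example 1.5), and astype(int) then truncates it, so two rows in the same group could receive the same ID. A minimal sketch, assuming ties should be broken by row order, passes method="first" instead. The tied data below is made up for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical data with a tie: two apples at distance 10.
tied = pd.DataFrame({"fruit": ["apple", "apple", "apple", "orange"],
                     "distance": [10, 10, 0, 5]})

# method="first" ranks tied values in the order they appear,
# so the ranks within each group stay unique whole numbers.
ranks = tied.groupby("fruit")["distance"].rank(method="first").astype(int)
tied["ID"] = tied["fruit"] + "_" + ranks.astype(str)
print(tied)
```

If tied rows should instead share a number, method="dense" or method="min" would be the options to look at, but then the IDs are no longer unique.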

IIUC, sort followed by groupby, cumcount, and string concatenation.

I'm not sure about the sort order at the end of your expected output, but this should work.

nums = (df.sort_values(["fruit", "distance"]).groupby(["fruit"]).cumcount() + 1).astype(str)

df['ID'] = df['fruit'] + '_' + nums
print(df)
    fruit  distance        ID
0   apple        10   apple_2
1   apple         0   apple_1
2   apple        20   apple_3
3   apple        40   apple_4
4  orange        20  orange_1
5  orange        50  orange_2
6  orange        70  orange_3
7  orange        90  orange_4
8  orange       110  orange_5
9  orange       130  orange_6
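
The reason this works without reordering df may not be obvious: sort_values changes the row order, but each row keeps its original index label, and pandas aligns on those labels when the counter is concatenated and assigned back. A small sketch of that behaviour, using a made-up four-row frame rather than the data above:

```python
import pandas as pd

# Small illustrative frame (not the data from the question).
df = pd.DataFrame({"fruit": ["apple", "apple", "orange", "orange"],
                   "distance": [10, 0, 50, 20]})

# The counter is computed in sorted order, but it is still labelled
# with the original index of each row.
nums = df.sort_values(["fruit", "distance"]).groupby("fruit").cumcount() + 1
print(nums.index.tolist())

# Series arithmetic and column assignment align on index labels,
# so each counter lands on the row it was computed for.
df["ID"] = df["fruit"] + "_" + nums.astype(str)
print(df)
```

Because cumcount only counts positions, it always produces unique numbers within a group, even when two distances are equal.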

