Add number to groups by count of values in Pandas Dataframe
I have a pandas DataFrame with a column whose rows I would like to group in packs of 3, incrementing an index for each pack.
id protocol protocol_grp
1 ISD ISD1
2 ISD ISD1
3 ISD ISD1
4 IRQ IRQ1
5 IRQ IRQ1
6 IRQ IRQ1
7 IRQ IRQ2
8 IRQ IRQ2
9 IRQ IRQ2
10 IRQ IRQ3
11 ISD ISD2
12 ISD ISD2
13 ISD ISD2
14 ISD ISD3
15 IRQ IRQ3
16 IRQ IRQ3
17 IRQ IRQ4
The desired output is the protocol_grp column. Each time I have seen 3 rows with the same protocol, I want to increment the index by 1.
I hope this makes sense.
You can use:
df['protocol_grp'] = df['protocol'] + df.groupby('protocol').cumcount() \
.floordiv(3).add(1).astype(str)
print(df)
# Output
id protocol protocol_grp
0 1 ISD ISD1
1 2 ISD ISD1
2 3 ISD ISD1
3 4 IRQ IRQ1
4 5 IRQ IRQ1
5 6 IRQ IRQ1
6 7 IRQ IRQ2
7 8 IRQ IRQ2
8 9 IRQ IRQ2
9 10 IRQ IRQ3
10 11 ISD ISD2
11 12 ISD ISD2
12 13 ISD ISD2
13 14 ISD ISD3
14 15 IRQ IRQ3
15 16 IRQ IRQ3 # <- check this row
16 17 IRQ IRQ4
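To see why this works, the one-liner can be split into its three steps. The following is a minimal sketch that rebuilds the sample data from the question; the intermediate names `pos` and `pack` are introduced here for illustration only and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (17 rows, two protocols).
protocols = ["ISD"] * 3 + ["IRQ"] * 7 + ["ISD"] * 4 + ["IRQ"] * 3
df = pd.DataFrame({"id": range(1, len(protocols) + 1), "protocol": protocols})

# Step 1: running position of each row within its protocol (0, 1, 2, ...).
pos = df.groupby("protocol").cumcount()

# Step 2: floor-divide by the pack size and shift to 1-based pack numbers.
pack = pos.floordiv(3).add(1)

# Step 3: append the pack number to the protocol name.
df["protocol_grp"] = df["protocol"] + pack.astype(str)

# Show the intermediate columns next to the result.
print(df.assign(pos=pos, pack=pack))
```

Note that the count runs per protocol over the whole column, not per consecutive run. That is why the rows with id 15 and 16 join the row with id 10 in IRQ3, even though ISD rows sit between them.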
Let us use cumcount, then take the floor division by the pack size:
df['protocol_grp'] = df['protocol'].add((df.groupby('protocol').cumcount()//3+1).astype(str))
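If the pack size may change, the same idea can be wrapped in a small helper. This is only a sketch: the function name `add_pack_label` and its parameters `col` and `size` are hypothetical and do not come from the answers above.

```python
import pandas as pd

def add_pack_label(df, col="protocol", size=3):
    """Return labels such as 'ISD1', numbering packs of `size` rows per value of `col`."""
    # cumcount gives the 0-based position of each row within its group;
    # integer division by `size` turns positions into 0-based pack numbers.
    pack = df.groupby(col).cumcount() // size + 1
    return df[col].add(pack.astype(str))

# Small demo with a pack size of 2 (data made up for illustration).
demo = pd.DataFrame({"protocol": ["A", "A", "B", "A", "B", "B", "B"]})
demo["protocol_grp"] = add_pack_label(demo, size=2)
print(demo)
```

With `size=3` this should give the same labels as the one-liners above, since `// 3 + 1` and `.floordiv(3).add(1)` are equivalent.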