Add new column indicating count in a pandas dataframe
I have a dataframe with some replicated rows:
item h2 h3 h4
----------------
foo v1 ... ...
foo v2 ... ...
foo v1 ... ...
foo v2 ... ...
foo v1 ... ...
foo v2 ... ...
foo v1 ... ...
foo v2 ... ...
bar v5 ... ...
bar v6 ... ...
bar v7 ... ...
bar v5 ... ...
bar v6 ... ...
bar v7 ... ...
My goal is to add a column (new_id) to this dataframe that holds an incrementing count of duplicate blocks (a block being a set of rows that have the same item name), prefixed with the value in the item column. If it helps, the replicated blocks will be consecutive.
item h2 h3 h4 new_id
-----------------------
foo v1 ... ... foo1
foo v2 ... ... foo1
foo v1 ... ... foo2
foo v2 ... ... foo2
foo v1 ... ... foo3
foo v2 ... ... foo3
foo v1 ... ... foo4
foo v2 ... ... foo4
bar v5 ... ... bar1
bar v6 ... ... bar1
bar v7 ... ... bar1
bar v5 ... ... bar2
bar v6 ... ... bar2
bar v7 ... ... bar2
Suggestions on how to accomplish this?
Use GroupBy.cumcount, grouping by both the item and h2 columns:
df['new_id'] = df['item'] + '_' + df.groupby(['item','h2']).cumcount().add(1).astype(str)
print(df)
item h2 h3 h4 new_id
0 foo v1 ... ... foo_1
1 foo v2 ... ... foo_1
2 foo v1 ... ... foo_2
3 foo v2 ... ... foo_2
4 foo v1 ... ... foo_3
5 foo v2 ... ... foo_3
6 foo v1 ... ... foo_4
7 foo v2 ... ... foo_4
8 bar v5 ... ... bar_1
9 bar v6 ... ... bar_1
10 bar v7 ... ... bar_1
11 bar v5 ... ... bar_2
12 bar v6 ... ... bar_2
13 bar v7 ... ... bar_2
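For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the same cumcount idea. The sample frame is rebuilt from the question with the h3 and h4 columns left out, and block_no is just an illustrative variable name. It drops the '_' separator so the result matches the foo1 format asked for in the question.

```python
import pandas as pd

# Sample data mirroring the question; the h3/h4 columns are omitted for brevity.
df = pd.DataFrame({
    "item": ["foo"] * 8 + ["bar"] * 6,
    "h2": ["v1", "v2"] * 4 + ["v5", "v6", "v7"] * 2,
})

# cumcount() numbers the rows within each (item, h2) group starting at 0,
# so the n-th occurrence of a given pair falls in the n-th block.
block_no = df.groupby(["item", "h2"]).cumcount().add(1)

# No separator here, to match the "foo1" style requested in the question.
df["new_id"] = df["item"] + block_no.astype(str)
print(df)
```

This relies on every block of an item containing the same set of h2 values, as in the sample data; if blocks can differ in content, the counts will not line up with block boundaries.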
Use str.cat() to concatenate the item column with the cumulative count of each group in h2. The cumulative count starts from zero, so offset it by 1:
df['new_id'] = df['item'].str.cat((df.groupby('h2').cumcount() + 1).astype(str), sep='')
print(df)
item h2 h3 h4 new_id
0 foo v1 ... ... foo1
1 foo v2 ... ... foo1
2 foo v1 ... ... foo2
3 foo v2 ... ... foo2
4 foo v1 ... ... foo3
5 foo v2 ... ... foo3
6 foo v1 ... ... foo4
7 foo v2 ... ... foo4
8 bar v5 ... ... bar1
9 bar v6 ... ... bar1
10 bar v7 ... ... bar1
11 bar v5 ... ... bar2
12 bar v6 ... ... bar2
13 bar v7 ... ... bar2
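One thing to watch with grouping on h2 alone: it gives the expected result only while no h2 value is shared between different items, which happens to hold in the sample data. The sketch below uses a small hypothetical frame (not from the question) in which v1 appears under both foo and bar, and compares grouping by h2 with grouping by both item and h2.

```python
import pandas as pd

# Hypothetical data: the value "v1" appears under both items.
df = pd.DataFrame({
    "item": ["foo", "foo", "bar", "bar"],
    "h2": ["v1", "v1", "v1", "v1"],
})

# Grouping by h2 only: the count keeps running across items.
by_h2 = df["item"].str.cat((df.groupby("h2").cumcount() + 1).astype(str))

# Grouping by item and h2: the count restarts for each item.
by_both = df["item"].str.cat(
    (df.groupby(["item", "h2"]).cumcount() + 1).astype(str)
)

print(by_h2.tolist())    # ['foo1', 'foo2', 'bar3', 'bar4']
print(by_both.tolist())  # ['foo1', 'foo2', 'bar1', 'bar2']
```

If h2 values may repeat across items, grouping by both columns is probably the safer choice.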