Add new column indicating count in a pandas dataframe

I have a dataframe with some replicated rows:

item h2 h3  h4
----------------
foo  v1 ... ...
foo  v2 ... ...
foo  v1 ... ...
foo  v2 ... ...
foo  v1 ... ...
foo  v2 ... ...
foo  v1 ... ...
foo  v2 ... ...
bar  v5 ... ...
bar  v6 ... ...
bar  v7 ... ...
bar  v5 ... ...
bar  v6 ... ...
bar  v7 ... ...

My goal is to add a column (new_id) to this dataframe that numbers each duplicate block incrementally, prefixed with the value in the item column. A block is one complete pass through an item's rows before they repeat (for example, the rows foo v1, foo v2 form one block). If it helps, the replicated blocks are always consecutive.

item h2 h3  h4   new_id
-----------------------
foo  v1 ... ...  foo1
foo  v2 ... ...  foo1
foo  v1 ... ...  foo2
foo  v2 ... ...  foo2
foo  v1 ... ...  foo3
foo  v2 ... ...  foo3
foo  v1 ... ...  foo4
foo  v2 ... ...  foo4
bar  v5 ... ...  bar1
bar  v6 ... ...  bar1
bar  v7 ... ...  bar1
bar  v5 ... ...  bar2
bar  v6 ... ...  bar2
bar  v7 ... ...  bar2

Suggestions on how to accomplish this?

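Use GroupBy.cumcount grouped by both columns, item and h2. The n-th occurrence of each (item, h2) pair falls in the n-th block, so the cumulative count plus 1 is the block number. This version joins the two parts with an underscore; drop the '_' to get foo1 rather than foo_1: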

df['new_id'] = df['item'] + '_' + df.groupby(['item','h2']).cumcount().add(1).astype(str)
print(df)
   item  h2   h3   h4 new_id
0   foo  v1  ...  ...  foo_1
1   foo  v2  ...  ...  foo_1
2   foo  v1  ...  ...  foo_2
3   foo  v2  ...  ...  foo_2
4   foo  v1  ...  ...  foo_3
5   foo  v2  ...  ...  foo_3
6   foo  v1  ...  ...  foo_4
7   foo  v2  ...  ...  foo_4
8   bar  v5  ...  ...  bar_1
9   bar  v6  ...  ...  bar_1
10  bar  v7  ...  ...  bar_1
11  bar  v5  ...  ...  bar_2
12  bar  v6  ...  ...  bar_2
13  bar  v7  ...  ...  bar_2
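
To try this on the question's sample data, here is a minimal, self-contained sketch. The "..." strings in h3 and h4 are placeholders, since the question elides the real values, and block_no is just an illustrative name. The separator is omitted so the result matches the foo1 format the question asks for.

```python
import pandas as pd

# Rebuild the sample frame from the question; h3/h4 hold placeholder
# strings because the original values are not shown.
df = pd.DataFrame({
    "item": ["foo"] * 8 + ["bar"] * 6,
    "h2": ["v1", "v2"] * 4 + ["v5", "v6", "v7"] * 2,
    "h3": ["..."] * 14,
    "h4": ["..."] * 14,
})

# cumcount() numbers the occurrences of each (item, h2) pair from 0,
# so adding 1 gives the 1-based block number.
block_no = df.groupby(["item", "h2"]).cumcount().add(1)

# Prefix the block number with the item name, without a separator.
df["new_id"] = df["item"] + block_no.astype(str)
print(df)
```

This relies on every h2 value appearing exactly once per block, which holds for the data shown in the question.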

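Use str.cat() to concatenate the item column with the cumulative count of each group in h2. The cumulative count starts at zero, so offset it by 1. Grouping by h2 alone works here because no h2 value is shared between items; if the same h2 value can occur under different items, group by both item and h2 instead.

df['new_id'] = df['item'].str.cat((df.groupby('h2').cumcount() + 1).astype(str), sep='')
print(df)
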
  item  h2   h3   h4 new_id
0   foo  v1  ...  ...   foo1
1   foo  v2  ...  ...   foo1
2   foo  v1  ...  ...   foo2
3   foo  v2  ...  ...   foo2
4   foo  v1  ...  ...   foo3
5   foo  v2  ...  ...   foo3
6   foo  v1  ...  ...   foo4
7   foo  v2  ...  ...   foo4
8   bar  v5  ...  ...   bar1
9   bar  v6  ...  ...   bar1
10  bar  v7  ...  ...   bar1
11  bar  v5  ...  ...   bar2
12  bar  v6  ...  ...   bar2
13  bar  v7  ...  ...   bar2
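
The difference between grouping by h2 alone and grouping by both columns only shows up when h2 values repeat across items. The sketch below uses a small made-up frame (not the question's data) in which foo and bar share the values v1 and v2; by_h2 and by_both are illustrative names.

```python
import pandas as pd

# Hypothetical data: both items use the same h2 values.
df = pd.DataFrame({
    "item": ["foo"] * 4 + ["bar"] * 4,
    "h2": ["v1", "v2"] * 4,
})

# Grouping by h2 only: the count keeps running across items.
by_h2 = df["item"].str.cat((df.groupby("h2").cumcount() + 1).astype(str))

# Grouping by item and h2: the count restarts for each item.
by_both = df["item"].str.cat(
    (df.groupby(["item", "h2"]).cumcount() + 1).astype(str)
)

print(by_h2.tolist())
print(by_both.tolist())
```

With shared h2 values, the h2-only version numbers the bar blocks 3 and 4, while the two-column version restarts them at 1.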
