
Repeating Counts in Pandas Data Frame

import pandas as pd
df = pd.DataFrame({
      'item':['a','b','c','d','e','f','g','h','i','k'],
      'counter':[1,2,3,1,2,3,1,2,3,1]
      })

Given this structure, what is the best way to auto-generate df['counter'] as a repeating range of integers, cycling through 1, 2, and 3 until it comes to the last row?

You can do:

df["counter_gen"] = df.index % 3 + 1

The modulo yields 0, 1, 2, so the +1 shifts the cycle to 1, 2, 3. The 3 is the cycle length, which you choose. Note that this relies on the default integer index (0, 1, 2, ...).
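If the frame does not have the default integer index (for example after filtering, or with string labels), the same idea can be applied to row positions instead. A minimal sketch, assuming a made-up string index and a hypothetical cycle_len variable purely for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative frame with a non-numeric index (not from the question)
df = pd.DataFrame({"item": list("abcdefghik")}, index=list("zyxwvutsrq"))

cycle_len = 3  # hypothetical name for the length of the repeating range

# np.arange(len(df)) gives row positions 0..n-1 regardless of the index labels
df["counter_gen"] = np.arange(len(df)) % cycle_len + 1
```

Because it only depends on len(df), this should behave the same whatever the index looks like.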

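Using np.put, which repeats the values when there are fewer of them than target positions. Copy the index values first; otherwise np.put can write into the DataFrame's own index, since the array may be a view of it:

import numpy as np
a=df.index.values.copy()
a
Out[637]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64)
np.put(a,np.arange(len(a)),np.array([1,2,3]))
a
Out[639]: array([1, 2, 3, 1, 2, 3, 1, 2, 3, 1], dtype=int64)
df['New']=a
df
Out[641]: 
   counter item  New
0        1    a    1
1        2    b    2
2        3    c    3
3        1    d    1
4        2    e    2
5        3    f    3
6        1    g    1
7        2    h    2
8        3    i    3
9        1    k    1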

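If performance is crucial, you may be able to make use of something like the following (integer division so that the repeat count is an int on Python 3, and a trailing slice to trim the overshoot to the length of the frame):

np.repeat([[1, 2, 3]], len(df) // 3 + 1, 0).ravel()[:len(df)]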

For a length 10^6 data frame, this is roughly 8 times faster to generate than the (much more elegant) df.index % 3 .
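If the goal is just to cycle a fixed pattern out to a given length, np.resize does that directly without the manual trim. This is a different function from the ones used above, and I have not timed it against them, so treat it as a sketch rather than a performance recommendation:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"item": list("abcdefghik")})

# np.resize fills the new array with repeated copies of the input,
# cutting the last copy short when the length is not a multiple of 3
df["counter_gen"] = np.resize([1, 2, 3], len(df))
```

The pattern does not have to be 1, 2, 3; any list passed as the first argument is cycled the same way.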
