Pandas dataframe: assign an ID that increments after every 4 consecutive rows

I am trying to assign an ID to a pandas dataframe based on row count. To do this, I am trying to apply the logic below to the dataframe:

import math

num = df.shape[0]  # number of rows
for i in range(num):
    print(math.ceil(i / 4))

The idea is that every 4 consecutive rows share one ID, so the resulting dataframe would look like this:

  col_1    Group_ID
   v_1        1
   v_2        1
   v_3        1
   v_4        1
   v_5        2
   v_6        2
   v_7        2
   v_8        2
   v_9        3
   v_10       3

And so on.

Just a quick thought: how can I use the apply function on df.index? Can I use the code below?

df['Index'] = df.index
df['GroupID'] = df['Index'].apply(np.ceil)

Any hints?

You can pass a function to apply, so create a named function and pass it:

import math

def everyFour(rowIdx):
    # Divide the row index by 4 and round up to the next integer.
    return math.ceil(rowIdx / 4)

df['GroupId'] = df['Index'].apply(everyFour)

Or just use a lambda:

df['GroupId'] = df['Index'].apply(lambda rowIdx: math.ceil(rowIdx / 4))

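Note that with the default 0-based index, row 0 gets group 0 and rows 1 to 4 get group 1, so every group is shifted by one row. You will probably want to add 1 to the row index before dividing by 4, so that the first four rows all land in group 1.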
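For illustration, here is a minimal sketch of that adjustment, assuming a default 0-based index and a made-up 10-row frame shaped like the one in the question. The second column uses integer division on the row position, which is an alternative I am adding for comparison and not part of the answer above; the column names GroupId and GroupId_floor are only placeholders.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical sample data shaped like the question's example (10 rows).
df = pd.DataFrame({"col_1": [f"v_{n}" for n in range(1, 11)]})

# Copy the default 0-based index into a column, as in the question.
df["Index"] = df.index

# Add 1 before dividing so that rows 0-3 all round up to group 1.
df["GroupId"] = df["Index"].apply(lambda row_idx: math.ceil((row_idx + 1) / 4))

# Alternative (an assumption, not from the answer): floor-divide the
# row position by 4 and shift by 1, without calling apply at all.
df["GroupId_floor"] = np.arange(len(df)) // 4 + 1

print(df[["col_1", "GroupId", "GroupId_floor"]])
```

Both columns should come out as 1, 1, 1, 1, 2, 2, 2, 2, 3, 3 for the ten rows. The integer-division version depends only on row position, so it should still work if the index is not a plain 0..n-1 range.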
