I am trying to assign an ID to a pandas DataFrame based on row count. For this, I am trying to apply
the logic below to the DataFrame:
import math

num = df.shape[0]
for i in range(num):
    print(math.ceil(i / 4))
The idea is that every 4 consecutive rows share one ID, so the resulting DataFrame would look like:
col_1 Group_ID
v_1 1
v_2 1
v_3 1
v_4 1
v_5 2
v_6 2
v_7 2
v_8 2
v_9 3
v_10 3
--- And so on.
Just a quick thought: how can I use the apply function on df.index? Can I use the code below?
df['Index'] = df.index
df['GroupID'] = df['Index'].apply(np.ceil)
Any hints?
You can pass a function to apply, so create a named function and pass it:
import math

def everyFour(rowIdx):
    return math.ceil(rowIdx / 4)
df['GroupId'] = df['Index'].apply(everyFour)
or just use a lambda
df['GroupId'] = df['Index'].apply(lambda rowIdx: math.ceil(rowIdx / 4))
Note that math.ceil(0 / 4) is 0, so the first row (index 0) ends up alone in group 0 and rows 1 to 4 fall into group 1. To start at 1 with four rows in every group, add 1 to rowIdx before dividing by 4.
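As a minimal sketch of that adjustment, the snippet below builds a small sample frame (the data is made up to match the layout in the question) and assigns the group ID in two ways: with apply and math.ceil after adding 1 to the index, and with a vectorized alternative that floor-divides the positional row number by 4. The vectorized form is not from the answer above; it is a swapped-in equivalent that also works when the index is not a plain 0..n-1 range.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical sample data shaped like the question's example.
df = pd.DataFrame({"col_1": [f"v_{i}" for i in range(1, 11)]})

# apply-based version: shift the 0-based index by 1 before dividing,
# so rows 0..3 map to 1, rows 4..7 map to 2, and so on.
df["Index"] = df.index
df["GroupId"] = df["Index"].apply(lambda rowIdx: math.ceil((rowIdx + 1) / 4))

# Vectorized alternative (assumption: grouping by row position, not index label).
df["GroupId_vec"] = np.arange(len(df)) // 4 + 1

print(df)
```

Both columns should agree for a default integer index; the vectorized one avoids a Python-level call per row.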