
Split pandas dataframe into groups of 20 and assign column value to each group

I have a df as follows.

TimeStamp,Value
 t1,akak
 t2,bb
 t3,vvv
 t5,ff
 t6,44
 t7,99
 t8,kfkkf
 t9,ff
 t10,oo

I want to split the df into groups of 2 rows each and assign the group number as the class.

TimeStamp,Value, class
 t1,akak,c1
 t2,bb,c1
 t3,vvv,c2
 t4,ff,c2
 t5,44,c3
 t6,99,c3
 t7,kfkkf,c4
 t8,ff,c4
 t9,oo,c5
 t10,oo,c5

One approach is to iterate and label the rows one at a time. I was wondering whether there is a built-in way to do this in pandas.

You could do:

df['class'] = [i//2 for i in range(len(df))]

But this is a pretty limited answer; you might want to apply a certain value on your other columns to get the group ID, or you may have a specific label in mind to apply for the class column, in which case you could follow up with a map function on the series to turn those numbers into something else.
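
The follow-up map step mentioned above might look roughly like this. It is only a sketch: the six-row sample frame and the "c1, c2, ..." label format are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data (6 rows), used only for illustration
df = pd.DataFrame({
    "TimeStamp": ["t1", "t2", "t3", "t4", "t5", "t6"],
    "Value": ["akak", "bb", "vvv", "ff", "44", "99"],
})

# Numeric group ID: rows 0-1 get 0, rows 2-3 get 1, and so on
df["class"] = [i // 2 for i in range(len(df))]

# Turn the numeric IDs into labels with Series.map
df["class"] = df["class"].map(lambda g: f"c{g + 1}")
print(df)
```

Series.map also accepts a dict, which may be handier when the labels do not follow a simple pattern.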

You can use this to achieve what you want:

df["class"] = [f"c{(i // 2) + 1}" for i in range(df.shape[0])]
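
Since the title asks for groups of 20 while the example uses 2, the same idea can be wrapped with the group size as a parameter. The helper name `add_group_labels` and its arguments are hypothetical, not part of pandas.

```python
import pandas as pd

def add_group_labels(df, size, prefix="c", column="class"):
    """Return a copy of df with rows labelled in consecutive groups of `size`.

    Hypothetical helper for illustration; the last group may be shorter.
    """
    if size < 1:
        raise ValueError("size must be a positive integer")
    out = df.copy()
    out[column] = [f"{prefix}{i // size + 1}" for i in range(len(out))]
    return out

# Assumed sample: 45 rows split into groups of 20 (20 + 20 + 5)
sample = pd.DataFrame({"Value": range(45)})
labelled = add_group_labels(sample, size=20)
print(labelled["class"].unique())
```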

Another possible solution:

import numpy as np
df['class'] = ['c' + str(1 + x) for x in np.repeat(range((len(df) + 1) // 2), 2)[:len(df)]]

Output:

  TimeStamp  Value class
0        t1   akak    c1
1        t2     bb    c1
2        t3    vvv    c2
3        t4     ff    c2
4        t5     ff    c3
5        t6     44    c3
6        t7     99    c4
7        t8  kfkkf    c4
8        t9     ff    c5
9       t10     oo    c5

You can vectorize the operation with NumPy:

import numpy as np

df['class'] = np.char.add('c', (np.arange(len(df)) // 2 + 1).astype(str))

Or, with a Series:

df['class'] = pd.Series(np.arange(len(df))//2+1, index=df.index, dtype='string').radd('c')

Output:

  TimeStamp  Value class
0        t1   akak    c1
1        t2     bb    c1
2        t3    vvv    c2
3        t4     ff    c2
4        t5     ff    c3
5        t6     44    c3
6        t7     99    c4
7        t8  kfkkf    c4
8        t9     ff    c5
9       t10     oo    c5
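
Once the class column exists, actually splitting the frame into per-group pieces could be done with groupby. A minimal sketch, assuming a seven-row sample frame so that the last group is incomplete:

```python
import pandas as pd

# Hypothetical sample data with an odd number of rows
df = pd.DataFrame({
    "TimeStamp": [f"t{i}" for i in range(1, 8)],
    "Value": list("abcdefg"),
})
df["class"] = [f"c{i // 2 + 1}" for i in range(len(df))]

# sort=False keeps the groups in order of first appearance
chunks = {label: chunk for label, chunk in df.groupby("class", sort=False)}

for label, chunk in chunks.items():
    print(label, len(chunk))
```

Each value in `chunks` is an ordinary DataFrame holding the rows of one group.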

Try this:

df.assign(Class=(df.index//2+1).map('c{}'.format))

Output:

  TimeStamp  Value Class
0        t1   akak    c1
1        t2     bb    c1
2        t3    vvv    c2
3        t5     ff    c2
4        t6     44    c3
5        t7     99    c3
6        t8  kfkkf    c4
7        t9     ff    c4
8       t10     oo    c5
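
Note that df.index // 2 groups by index label rather than by row position, so it gives consecutive pairs only when the index is the default 0..n-1. If the index has gaps (for example after filtering), a position-based variant along these lines may be safer. The sample frame and its index values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index is not 0..n-1
df = pd.DataFrame(
    {"Value": ["akak", "bb", "vvv", "ff", "44"]},
    index=[10, 11, 15, 20, 21],
)

# Group numbers computed from row positions, independent of index labels
positions = np.arange(len(df)) // 2 + 1
result = df.assign(Class=[f"c{p}" for p in positions])
print(result)
```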
