
Pandas Groupby Faster Alternative?

My merged dataframe looks like:

df =
 Datetimes      Symbol      Open    High    Low    Close    Volume  
0   2020-04-15  20MICRONS   26.60   31.40   25.60   27.85   75893
0   2020-04-16  20MICRONS   28.00   28.65   24.65   26.80   87254
0   2020-04-17  20MICRONS   28.80   29.00   26.80   28.75   81116
0   2020-04-15  33MICRONS   26.60   31.40   25.60   27.85   75893
0   2020-04-16  33MICRONS   28.00   28.65   24.65   26.80   87254
0   2020-04-17  33MICRONS   28.80   29.00   26.80   28.75   81116

I want to compute the day-over-day change in volume for every symbol.

I came up with this:

def checkvol(tf):
    tf['vol'] = tf.Volume/tf.Volume.shift(1)
    return tf

df = df.groupby('Symbol').apply(checkvol)

Is there a faster alternative? I also apply other functions to my df, sliced by symbol.

You can avoid groupby().apply by shifting Volume within each Symbol group and dividing the column by the result:

In [158]: df['vol'] = df.Volume.div(df.groupby('Symbol')['Volume'].shift(1))

In [159]: df
Out[159]: 
    Datetimes     Symbol  Open   High    Low  Close  Volume       vol
0  2020-04-15  20MICRONS  26.6  31.40  25.60  27.85   75893       NaN
0  2020-04-16  20MICRONS  28.0  28.65  24.65  26.80   87254  1.149698
0  2020-04-17  20MICRONS  28.8  29.00  26.80  28.75   81116  0.929654
0  2020-04-15  33MICRONS  26.6  31.40  25.60  27.85   75893       NaN
0  2020-04-16  33MICRONS  28.0  28.65  24.65  26.80   87254  1.149698
0  2020-04-17  33MICRONS  28.8  29.00  26.80  28.75   81116  0.929654
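
For anyone who wants to try this on its own, here is a minimal self-contained sketch of the same shift-and-divide idea, using the sample rows from the question. It assumes the rows are sorted by date within each symbol (the sort below enforces that) and resets the duplicated index, which the original frame does not do. The `vol_vs_mean` column is a hypothetical extra feature, included only to illustrate that other per-symbol calculations can often be written with built-in groupby methods such as `transform` instead of `apply`.

```python
import pandas as pd

# Sample rows copied from the question (only the columns needed here).
df = pd.DataFrame({
    "Datetimes": pd.to_datetime(["2020-04-15", "2020-04-16", "2020-04-17"] * 2),
    "Symbol": ["20MICRONS"] * 3 + ["33MICRONS"] * 3,
    "Volume": [75893, 87254, 81116] * 2,
})

# shift() only lines up with the previous day if rows are ordered per symbol.
df = df.sort_values(["Symbol", "Datetimes"]).reset_index(drop=True)

by_symbol = df.groupby("Symbol")["Volume"]

# Ratio to the previous row of the same symbol; the first row per symbol is NaN.
df["vol"] = df["Volume"].div(by_symbol.shift(1))

# Hypothetical extra feature: volume relative to that symbol's mean volume.
df["vol_vs_mean"] = df["Volume"] / by_symbol.transform("mean")

print(df)
```

Whether this is faster in practice depends on the number of groups and the pandas version, so it is worth timing both versions on the real data.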
