
Faster way to filter pandas dataframe and create new columns

Given df

   ticker  close   open
0    AAPL    1.2    1.1
1    TSLA   25.0   27.0
2    TSLA   83.0   80.0
3    TSLA   95.0   93.0
4    CCL   234.0  234.2
5    AAPL  512.0  520.0

My purpose:

(1) Apply functions to each ticker dataframe (subset)

(2) Create a new column of string values (like 'exist') for each ticker dataframe

My expected output

   ticker  close   open  candlestick   SMA_20     SMA_50
0    AAPL    1.2    1.1  bullish      (number)   (number)
1    TSLA   25.0   27.0  bearish      (number)   (number)
2    TSLA   83.0   80.0  bullish      (number)   (number)
3    TSLA   95.0   93.0  bullish      (number)   (number)
4    CCL   234.0  234.2  bearish      (number)   (number)
5    AAPL  512.0  520.0  bearish      (number)   (number)

I've tried this code, which is extremely slow:

for x in df.ticker:
    df_ticker = df[df.ticker == x]
    df_close_price = pd.DataFrame(df_ticker.close)
    for days in [20,50]:
        df_ticker[f'SMA_{days}'] = df_close_price.apply(lambda c: abstract.SMA(c, days))
    ......
    df_result = df_result.append(df_ticker)

How can I filter the dataframe by ticker faster when dealing with millions of rows? Many people suggested using .loc or numpy, but I could not find a way to apply them here.

Thanks!

I think you need numpy.where:

df['candlestick'] = np.where(df['close'] > df['open'], 'bullish', 'bearish')
print (df)
  ticker  close   open candlestick
0   AAPL    1.2    1.1     bullish
1   TSLA   25.0   27.0     bearish
2   TSLA   83.0   80.0     bullish
3   TSLA   95.0   93.0     bullish
4    CCL  234.0  234.2     bearish
5   AAPL  512.0  520.0     bearish

EDIT: It is possible to use GroupBy.apply with a custom function. The main change is to pass a Series to abstract.SMA directly instead of calling .apply(lambda c: abstract.SMA(c, days)):

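def f(x):
    # x is the sub-DataFrame for one ticker
    for days in [20,50]:
        # pass the close Series straight to abstract.SMA
        x[f'SMA_{days}'] = abstract.SMA(x.close, days)
    return x

# group on the whole frame (not only 'close') so f can read x.close and add columns
df = df.groupby('ticker', group_keys=False).apply(f)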
print (df)
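
If the only indicator needed is a simple moving average, a rough alternative is to skip the custom function and let pandas compute a per-ticker rolling mean. This is only a sketch: it swaps abstract.SMA for Series.rolling(...).mean(), which I assume is equivalent for a plain SMA, and it uses windows of 2 and 3 instead of 20 and 50 so the six sample rows produce some non-NaN values.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'ticker': ['AAPL', 'TSLA', 'TSLA', 'TSLA', 'CCL', 'AAPL'],
    'close': [1.2, 25.0, 83.0, 95.0, 234.0, 512.0],
    'open': [1.1, 27.0, 80.0, 93.0, 234.2, 520.0],
})

# Vectorized label for the whole frame, no per-ticker loop needed
df['candlestick'] = np.where(df['close'] > df['open'], 'bullish', 'bearish')

# Demo windows (assumption): replace with [20, 50] on real data
for days in [2, 3]:
    # transform returns a result aligned to the original index,
    # so it can be assigned straight back as a new column
    df[f'SMA_{days}'] = (
        df.groupby('ticker')['close']
          .transform(lambda s: s.rolling(days).mean())
    )

print(df)
```

Rows that do not yet have a full window of earlier rows for their ticker come out as NaN, so short groups such as CCL here stay empty.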

