Faster way to filter pandas dataframe and create new columns

Question

Given df

   ticker  close   open
0    AAPL    1.2    1.1
1    TSLA   25.0   27.0
2    TSLA   83.0   80.0
3    TSLA   95.0   93.0
4     CCL  234.0  234.2
5    AAPL  512.0  520.0

My goal:

(1) Apply functions to each ticker's subset of the dataframe

(2) Create a new column of string values (like 'exist') for each ticker's subset

My expected output:

   ticker  close   open  candlestick   SMA_20     SMA_50
0    AAPL    1.2    1.1  bullish      (number)   (number)
1    TSLA   25.0   27.0  bearish      (number)   (number)
2    TSLA   83.0   80.0  bullish      (number)   (number)
3    TSLA   95.0   93.0  bullish      (number)   (number)
4     CCL  234.0  234.2  bearish      (number)   (number)
5    AAPL  512.0  520.0  bearish      (number)   (number)

I've tried this code, which is extremely slow:

for x in df.ticker:
    df_ticker = df[df.ticker == x]
    df_close_price = pd.DataFrame(df_ticker.close)
    for days in [20,50]:
       df_ticker[f'SMA_{days}'] = df_close_price.apply(lambda c: abstract.SMA(c, days))
    ......
    df_result = df_result.append(df_ticker)

I was wondering how to filter the dataframe by ticker in a faster way when dealing with millions of rows. Many people suggested using .loc or numpy, but I could not find a way to apply them here.

Thanks!

Answer 1

I think you need numpy.where:

df['candlestick'] = np.where(df['close'] > df['open'], 'bullish', 'bearish')
print (df)
  ticker  close   open candlestick
0   AAPL    1.2    1.1     bullish
1   TSLA   25.0   27.0     bearish
2   TSLA   83.0   80.0     bullish
3   TSLA   95.0   93.0     bullish
4    CCL  234.0  234.2     bearish
5   AAPL  512.0  520.0     bearish

EDIT: It is possible to use GroupBy.apply with a custom function; the main point is to pass a Series to abstract.SMA instead of calling .apply(lambda c: abstract.SMA(c, days)):

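def f(x):
    # x is the sub-DataFrame for one ticker
    for days in [20, 50]:
        # pass the close Series directly to abstract.SMA
        x[f'SMA_{days}'] = abstract.SMA(x.close, days)
    return x

# group the whole DataFrame (not only 'close') so f can add columns
df = df.groupby('ticker', group_keys=False).apply(f)
print (df)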
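
If the SMA you need is a plain simple moving average, a pandas-only variant may also work and avoids calling a Python function that rebuilds each sub-DataFrame. The sketch below is an assumption on my side, not the code from the question: it swaps abstract.SMA for a rolling mean of close per ticker, and it uses windows of 2 and 3 only because the sample data has six rows (the question uses 20 and 50).

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "ticker": ["AAPL", "TSLA", "TSLA", "TSLA", "CCL", "AAPL"],
    "close": [1.2, 25.0, 83.0, 95.0, 234.0, 512.0],
    "open": [1.1, 27.0, 80.0, 93.0, 234.2, 520.0],
})

# Row-wise label; no grouping is needed for this column
df["candlestick"] = np.where(df["close"] > df["open"], "bullish", "bearish")

# Per-ticker simple moving average, computed as a rolling mean of close.
# Assumption: windows 2 and 3 are illustrative; replace with 20 and 50 on real data.
grouped_close = df.groupby("ticker")["close"]
for days in (2, 3):
    # transform returns a Series aligned to the original index
    df[f"SMA_{days}"] = grouped_close.transform(
        lambda s, n=days: s.rolling(n).mean()
    )

print(df)
```

Rows that do not yet have enough history inside their own ticker get NaN, so the original row order and index are kept and nothing has to be appended back together.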
