Given df
  ticker  close   open
0   AAPL    1.2    1.1
1   TSLA   25.0   27.0
2   TSLA   83.0   80.0
3   TSLA   95.0   93.0
4    CCL  234.0  234.2
5   AAPL  512.0  520.0
My purpose:
(1) Apply functions to each ticker dataframe (subset)
(2) Add a new column of string values (like 'exist') to each ticker dataframe
My expected output
  ticker  close   open candlestick    SMA_20    SMA_50
0   AAPL    1.2    1.1     bullish  (number)  (number)
1   TSLA   25.0   27.0     bearish  (number)  (number)
2   TSLA   83.0   80.0     bullish  (number)  (number)
3   TSLA   95.0   93.0     bullish  (number)  (number)
4    CCL  234.0  234.2     bearish  (number)  (number)
5   AAPL  512.0  520.0     bearish  (number)  (number)
I've tried this code, which is extremely slow:
for x in df.ticker:
    df_ticker = df[df.ticker == x]
    df_close_price = pd.DataFrame(df_ticker.close)
    for days in [20,50]:
        df_ticker[f'SMA_{days}'] = df_close_price.apply(lambda c: abstract.SMA(c, days))
    ......
    df_result = df_result.append(df_ticker)
I was wondering how to filter the dataframe by ticker in a faster way when dealing with millions of rows. Many suggested using .loc or numpy, but I could not find a way to make that work.
Thanks!
I think you need numpy.where:
df['candlestick'] = np.where(df['close'] > df['open'], 'bullish', 'bearish')
print (df)
  ticker  close   open candlestick
0   AAPL    1.2    1.1     bullish
1   TSLA   25.0   27.0     bearish
2   TSLA   83.0   80.0     bullish
3   TSLA   95.0   93.0     bullish
4    CCL  234.0  234.2     bearish
5   AAPL  512.0  520.0     bearish
EDIT: It is possible to use GroupBy.apply with a custom function. The main point is to pass a Series to abstract.SMA directly instead of calling .apply(lambda c: abstract.SMA(c, days)):
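# abstract is assumed to be TA-Lib's abstract module, as used in the question
def f(x):
    # x is the sub-DataFrame for one ticker
    for days in [20,50]:
        x[f'SMA_{days}'] = abstract.SMA(x.close, days)
    return x
# group the whole DataFrame (not only 'close') so that f receives every column
df = df.groupby('ticker', group_keys=False).apply(f)
print(df)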
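If TA-Lib is not available, or GroupBy.apply is still too slow on millions of rows, a pandas-only variant may be worth trying. The sketch below swaps abstract.SMA for a per-ticker rolling mean, which I would expect to give the same values for a simple moving average, but I have not compared the two on real data. The helper name add_sma and the window of 2 are only assumptions for illustration, chosen so that the six sample rows produce some non-NaN values.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ticker": ["AAPL", "TSLA", "TSLA", "TSLA", "CCL", "AAPL"],
    "close": [1.2, 25.0, 83.0, 95.0, 234.0, 512.0],
    "open": [1.1, 27.0, 80.0, 93.0, 234.2, 520.0],
})

# Vectorized label over the whole frame, no per-ticker loop needed.
df["candlestick"] = np.where(df["close"] > df["open"], "bullish", "bearish")


def add_sma(frame, windows=(20, 50)):
    """Hypothetical helper: add one SMA_<days> column per window.

    The rolling mean is computed separately inside each ticker group,
    so values never leak from one ticker into another.
    """
    out = frame.copy()
    closes = out.groupby("ticker")["close"]
    for days in windows:
        # transform returns a Series aligned with the original index
        out[f"SMA_{days}"] = closes.transform(lambda s: s.rolling(days).mean())
    return out


# Window of 2 only because the sample has so few rows per ticker.
result = add_sma(df, windows=(2,))
print(result)
```

With the real data you would call add_sma(df) to get SMA_20 and SMA_50. The first days - 1 rows of each ticker are NaN because the window is not yet full.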