I've written a script (shown below) that finds local maxima in historical stock data. It uses the daily highs to mark local resistance levels. It works well, but for any given point in time (i.e., any row in the stock data), I want to know the most recent resistance level just prior to that point, and I want it in its own column in the dataset. For instance:
The top grey line is the high for each day, and the bottom grey line is the close for each day. Roughly speaking, the dataset for that section looks like this:
High Close
216.8099976 216.3399963
215.1499939 213.2299957
214.6999969 213.1499939
215.7299957 215.2799988 <- First blue dot at high
213.6900024 213.3699951
214.8800049 213.4100037 <- 2nd blue dot at high
214.5899963 213.4199982
216.0299988 215.8200073
217.5299988 217.1799927 <- 3rd blue dot at high
216.8800049 215.9900055
215.2299957 214.2400055
215.6799927 215.5700073
....
Right now, this script looks at the entire dataset at once to determine the local maxima indexes for the highs. Then, for any given point in the stock history (i.e., any given row), it looks for the NEXT maximum in the list of all maxima found. That would be a way to determine where the next resistance level is, but I don't want that because it introduces look-ahead bias. I just want a column with the most recent past resistance level, or ideally the two most recent levels in two columns.
So my final output would look like this for the one-column case:
High Close Most_Rec_Max
216.8099976 216.3399963 0
215.1499939 213.2299957 0
214.6999969 213.1499939 0
215.7299957 215.2799988 0
213.6900024 213.3699951 215.7299957
214.8800049 213.4100037 215.7299957
214.5899963 213.4199982 214.8800049
216.0299988 215.8200073 214.8800049
217.5299988 217.1799927 214.8800049
216.8800049 215.9900055 217.5299988
215.2299957 214.2400055 217.5299988
215.6799927 215.5700073 217.5299988
....
You'll notice that a dot's value only shows up in the Most_Rec_Max column on the rows after it has been discovered.
Here is the code I am using:
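import numpy as np
import matplotlib.pyplot as plt

real_close_prices = df['Close'].to_numpy()
highs = df['High'].to_numpy()
x = np.arange(len(highs))  # assumed x-axis for the plot below (row positions)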
max_indexes = (np.diff(np.sign(np.diff(highs))) < 0).nonzero()[0] + 1 # local max
# +1 due to the fact that diff reduces the original index number
max_values_at_indexes = highs[max_indexes]
curr_high = [c for c in highs]
max_values_at_indexes.sort()
for m in max_values_at_indexes:
    for i, c in enumerate(highs):
        if m > c and curr_high[i] == c:
            curr_high[i] = m
#print(nextbig)
df['High_Resistance'] = curr_high
# plot
plt.figure(figsize=(12, 5))
plt.plot(x, highs, color='grey')
plt.plot(x, real_close_prices, color='grey')
plt.plot(x[max_indexes], highs[max_indexes], "o", label="max", color='b')
plt.show()
Hoping someone will be able to help me out with this. Thanks!
Here is one approach. Once you know where the peaks are, you can store peak indices in p_ids and peak values in p_vals. To assign the k'th most recent peak, note that p_vals[:-k] will occur at p_ids[k:]. The rest is forward filling.
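To make the slicing concrete, here is a small sketch using the three peaks from the sample data in the question. The values are rounded to two decimals for readability, and the variable names `rows`, `vals` and `pairs` are only illustrative.

```python
import numpy as np

p_ids = np.array([3, 5, 8])                   # rows where the peaks occur
p_vals = np.array([215.73, 214.88, 217.53])   # highs at those rows (rounded)

k = 1
rows = p_ids[k:]     # rows of the 2nd and 3rd peak
vals = p_vals[:-k]   # values of the 1st and 2nd peak

# Each pair reads: "at this row, the peak found k peaks earlier was this value"
pairs = list(zip(rows.tolist(), vals.tolist()))
print(pairs)
```

So when the second peak appears at row 5, the previous peak (215.73) becomes the second most recent one, and forward filling carries it down until the next peak.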
import numpy as np

# find all local maxima in the series by comparing to shifted values
peaks = (df.High > df.High.shift(1)) & (df.High > df.High.shift(-1))
# keep the peak value where a peak occurs and NaN elsewhere,
# forward fill with previous peak value & handle leading NaNs with fillna
df['Most_Rec_Max'] = df.High.where(peaks).ffill().fillna(0)
# for finding n-most recent peak
# (positions are used as row labels below, so this assumes a default RangeIndex)
p_ids, = np.where(peaks)
p_vals = df.High.to_numpy()[p_ids]
for n in [1, 2]:
    col_name = f'{n+1}_Most_Rec_Max'
    df[col_name] = np.nan
    # the peak found n peaks earlier is written at the row of the current peak
    df.loc[p_ids[n:], col_name] = p_vals[:-n]
    # assign the result back; chained inplace calls may not update df in recent pandas
    df[col_name] = df[col_name].ffill().fillna(0)
# High Close Most_Rec_Max 2_Most_Rec_Max 3_Most_Rec_Max
# 0 216.809998 216.339996 0.000000 0.000000 0.000000
# 1 215.149994 213.229996 0.000000 0.000000 0.000000
# 2 214.699997 213.149994 0.000000 0.000000 0.000000
# 3 215.729996 215.279999 215.729996 0.000000 0.000000
# 4 213.690002 213.369995 215.729996 0.000000 0.000000
# 5 214.880005 213.410004 214.880005 215.729996 0.000000
# 6 214.589996 213.419998 214.880005 215.729996 0.000000
# 7 216.029999 215.820007 214.880005 215.729996 0.000000
# 8 217.529999 217.179993 217.529999 214.880005 215.729996
# 9 216.880005 215.990006 217.529999 214.880005 215.729996
# 10 215.229996 214.240006 217.529999 214.880005 215.729996
# 11 215.679993 215.570007 217.529999 214.880005 215.729996
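Note that the table above places each peak on the row where it occurs, while deciding that a row is a peak needs the following row's high (the `shift(-1)` comparison). If that look-ahead matters, one option is to build the columns row by row, confirming a peak only once the next high is known. The following is a minimal sketch under that assumption, not a tuned implementation; `recent_peaks` and `n_levels` are hypothetical names, and the highs are the sample values from the question rounded to two decimals.

```python
def recent_peaks(highs, n_levels=2):
    """For each row, list the n_levels most recent confirmed local maxima.

    A peak at row i-1 is confirmed at row i, once highs[i] is known.
    Missing levels are padded with 0.0, as in the question's expected output.
    """
    known = []   # confirmed peak values, oldest first
    rows = []
    for i, h in enumerate(highs):
        # Row i-1 is a peak if it exceeds both row i-2 and the current row i
        if i >= 2 and highs[i - 1] > highs[i - 2] and highs[i - 1] > h:
            known.append(highs[i - 1])
        recent = known[::-1][:n_levels]   # newest first
        rows.append(recent + [0.0] * (n_levels - len(recent)))
    return rows


highs = [216.81, 215.15, 214.70, 215.73, 213.69, 214.88,
         214.59, 216.03, 217.53, 216.88, 215.23, 215.68]
levels = recent_peaks(highs)
for i, row in enumerate(levels):
    print(i, row)
```

With this ordering the first peak (row 3) first appears on row 4, which matches the Most_Rec_Max column requested in the question. The result can be attached to a dataframe with something like `pd.DataFrame(levels, columns=[...], index=df.index)`.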
I just came across a function that might help you a lot: scipy.signal.find_peaks.
Based on your sample dataframe, we can do the following:
import pandas as pd
from scipy.signal import find_peaks
## Grab the minimum high value as a threshold.
min_high = df["High"].min()
### Run the High values through the function. The docs explain more,
### but we can set our height to the minimum high value.
### We just need one out of two return values.
peaks, _ = find_peaks(df["High"], height=min_high)
### Do some maintenance and add a column to mark peaks
# peaks_df is assumed here to hold one row per detected peak,
# indexed by the peak's row position
peaks_df = pd.DataFrame({"local_high": 1}, index=peaks)
# Merge on our index values
df1 = df.merge(peaks_df, how="left", left_index=True, right_index=True)
# Set non-null values to 1 and null values to 0; Convert column to integer type.
df1.loc[~df1["local_high"].isna(), "local_high"] = 1
df1.loc[df1["local_high"].isna(), "local_high"] = 0
df1["local_high"] = df1["local_high"].astype(int)
Then, your dataframe should look like the following:
High Close local_high
0 216.809998 216.339996 0
1 215.149994 213.229996 0
2 214.699997 213.149994 0
3 215.729996 215.279999 1
4 213.690002 213.369995 0
5 214.880005 213.410004 1
6 214.589996 213.419998 0
7 216.029999 215.820007 0
8 217.529999 217.179993 1
9 216.880005 215.990005 0
10 215.229996 214.240005 0
11 215.679993 215.570007 0
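The flag column marks where the peaks are but does not yet give the "most recent past resistance" column the question asks for. One way to get there from the flags is sketched below, assuming a dataframe shaped like the table above (the highs and flags are copied from it) and assuming a peak should only count from the row after it occurs. The name `peak_highs` is illustrative.

```python
import pandas as pd

# Highs and peak flags copied from the table above
df1 = pd.DataFrame({
    "High": [216.809998, 215.149994, 214.699997, 215.729996,
             213.690002, 214.880005, 214.589996, 216.029999,
             217.529999, 216.880005, 215.229996, 215.679993],
    "local_high": [0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0],
})

# Keep the high on flagged rows and NaN everywhere else
peak_highs = df1["High"].where(df1["local_high"].eq(1))

# Carry the last peak forward, then delay by one row so a peak is only
# reported after the following day's high has confirmed it
df1["Most_Rec_Max"] = peak_highs.ffill().shift(1).fillna(0)

print(df1)
```

Dropping the `.shift(1)` would report each peak on its own row instead, which uses the next day's high and so reintroduces the look-ahead the question wants to avoid.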