Rolling window of local minima/maxima

Question

I've written a script (shown below) that finds local maxima in historical stock data. It uses the daily highs to mark out local resistance levels. It works well, but for any given point in time (that is, any row in the stock data), I want to know what the most recent resistance level was just prior to that point, and I want this in its own column in the dataset. For instance:

The top grey line is the high for each day, and the bottom grey line is the close for each day. Roughly speaking, the dataset for that section looks like this:

High            Close
216.8099976     216.3399963
215.1499939     213.2299957
214.6999969     213.1499939
215.7299957     215.2799988 <- First blue dot at high
213.6900024     213.3699951
214.8800049     213.4100037 <- 2nd blue dot at high
214.5899963     213.4199982 
216.0299988     215.8200073
217.5299988     217.1799927 <- 3rd blue dot at high
216.8800049     215.9900055
215.2299957     214.2400055
215.6799927     215.5700073
....

Right now, the script looks at the entire dataset at once to determine the local maxima indexes for the highs, and then, for any given point in the stock history (i.e. any given row), it looks for the NEXT maximum in the list of all maxima found. That would be a way to determine where the next resistance level is, but I don't want that because of look-ahead bias. I just want a column holding the most recent past resistance level, or ideally the two most recent levels in two columns.

So for the single-column case, my final output would look like this:

High            Close           Most_Rec_Max
216.8099976     216.3399963     0
215.1499939     213.2299957     0
214.6999969     213.1499939     0
215.7299957     215.2799988     0
213.6900024     213.3699951     215.7299957
214.8800049     213.4100037     215.7299957
214.5899963     213.4199982     214.8800049
216.0299988     215.8200073     214.8800049
217.5299988     217.1799927     214.8800049
216.8800049     215.9900055     217.5299988
215.2299957     214.2400055     217.5299988
215.6799927     215.5700073     217.5299988
....

Notice that a dot only shows up in the most-recent column after it has already been discovered, i.e. one row after the peak itself.

Here is the code I am using:

import numpy as np
import matplotlib.pyplot as plt

real_close_prices = df['Close'].to_numpy()

highs = df['High'].to_numpy()

max_indexes = (np.diff(np.sign(np.diff(highs))) < 0).nonzero()[0] + 1 # local max
# +1 due to the fact that diff reduces the original index number

max_values_at_indexes = highs[max_indexes]
curr_high = [c for c in highs]
max_values_at_indexes.sort()
for m in max_values_at_indexes:
    for i, c in enumerate(highs):
        if m > c and curr_high[i] == c:
            curr_high[i] = m
#print(nextbig)
df['High_Resistance'] = curr_high

# plot (x is not defined in the snippet; assumed here to be the row positions)
x = np.arange(len(highs))
plt.figure(figsize=(12, 5))
plt.plot(x, highs, color='grey')
plt.plot(x, real_close_prices, color='grey')
plt.plot(x[max_indexes], highs[max_indexes], "o", label="max", color='b')
plt.show()

Hoping someone will be able to help me out with this. Thanks!

Answer 1

Here is one approach. Once you know where the peaks are, you can store the peak indices in p_ids and the peak values in p_vals. To assign the k-th most recent peak, note that p_vals[:-k] will occur at p_ids[k:]. The rest is forward filling.

# find all local maxima in the series by comparing to shifted values
peaks = (df.High > df.High.shift(1)) & (df.High > df.High.shift(-1))
# keep the High value where a peak is achieved and NaN otherwise, then
# forward fill with the previous peak value & handle leading NaNs with fillna
df['Most_Rec_Max'] = df.High.where(peaks).ffill().fillna(0)

# for finding the n-th most recent peak
# (p_ids are positions; using them with .loc assumes a default RangeIndex)
p_ids, = np.where(peaks)
p_vals = df.High.to_numpy()[p_ids]
for n in [1, 2]:
    col_name = f'{n+1}_Most_Rec_Max'
    df[col_name] = np.nan
    # the peak found n peaks earlier becomes visible at the n-th later peak
    df.loc[p_ids[n:], col_name] = p_vals[:-n]
    # assign the result back; chained inplace calls may not modify df in newer pandas
    df[col_name] = df[col_name].ffill().fillna(0)


#           High       Close  Most_Rec_Max  2_Most_Rec_Max  3_Most_Rec_Max
# 0   216.809998  216.339996      0.000000        0.000000        0.000000
# 1   215.149994  213.229996      0.000000        0.000000        0.000000
# 2   214.699997  213.149994      0.000000        0.000000        0.000000
# 3   215.729996  215.279999    215.729996        0.000000        0.000000
# 4   213.690002  213.369995    215.729996        0.000000        0.000000
# 5   214.880005  213.410004    214.880005      215.729996        0.000000
# 6   214.589996  213.419998    214.880005      215.729996        0.000000
# 7   216.029999  215.820007    214.880005      215.729996        0.000000
# 8   217.529999  217.179993    217.529999      214.880005      215.729996
# 9   216.880005  215.990006    217.529999      214.880005      215.729996
# 10  215.229996  214.240006    217.529999      214.880005      215.729996
# 11  215.679993  215.570007    217.529999      214.880005      215.729996
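
Note that in the output above a peak already appears in its own row, whereas the table in the question only shows a peak one row later, once the following day's lower high has confirmed it. A minimal sketch of that delayed variant, assuming the sample highs from the question and a default integer index (the function name most_recent_peaks and the n_cols parameter are hypothetical, chosen only for illustration):

```python
import pandas as pd

def most_recent_peaks(high: pd.Series, n_cols: int = 2) -> pd.DataFrame:
    """Return the n_cols most recent *confirmed* local maxima for each row."""
    peaks = (high > high.shift(1)) & (high > high.shift(-1))
    # A peak at row i is only known at row i + 1, so delay it by one row.
    confirmed = high.where(peaks).shift(1)
    # Peak values, indexed by the row at which each peak was confirmed.
    peak_vals = confirmed.dropna()
    out = pd.DataFrame(index=high.index)
    for k in range(n_cols):
        # shift(k) moves each peak value to the confirmation row of the k-th later peak
        col = peak_vals.shift(k).reindex(high.index).ffill().fillna(0)
        name = "Most_Rec_Max" if k == 0 else f"{k + 1}_Most_Rec_Max"
        out[name] = col
    return out

# Sample highs copied from the question
highs = [216.8099976, 215.1499939, 214.6999969, 215.7299957,
         213.6900024, 214.8800049, 214.5899963, 216.0299988,
         217.5299988, 216.8800049, 215.2299957, 215.6799927]
df = pd.DataFrame({"High": highs})
result = most_recent_peaks(df["High"])
```

With the sample data, result["Most_Rec_Max"] stays at 0 through row 3 and first shows 215.7299957 at row 4, which is the layout requested in the question.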

Answer 2

I just came across a function that might help you a lot: scipy.signal.find_peaks.

Based on your sample dataframe, we can do the following:

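import pandas as pd
from scipy.signal import find_peaks

# Grab the minimum high value as a threshold.
min_high = df["High"].min()

# Run the High values through the function. The docs explain more,
# but we can set our height to the minimum high value.
# find_peaks returns (indices, properties); we only need the indices.
peaks, _ = find_peaks(df["High"], height=min_high)

# Do some maintenance and add a column to mark peaks.

# peaks_df is not defined in the answer; it is assumed here to be a frame
# indexed by the peak positions (this requires a default RangeIndex on df).
peaks_df = pd.DataFrame({"local_high": peaks}, index=peaks)

# Merge on our index values
df1 = df.merge(peaks_df, how="left", left_index=True, right_index=True)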

# Set non-null values to 1 and null values to 0; Convert column to integer type.
df1.loc[~df1["local_high"].isna(), "local_high"] = 1
df1.loc[df1["local_high"].isna(), "local_high"] = 0
df1["local_high"] = df1["local_high"].astype(int)

Then your dataframe should look like the following:

          High       Close  local_high
0   216.809998  216.339996           0
1   215.149994  213.229996           0
2   214.699997  213.149994           0
3   215.729996  215.279999           1
4   213.690002  213.369995           0
5   214.880005  213.410004           1
6   214.589996  213.419998           0
7   216.029999  215.820007           0
8   217.529999  217.179993           1
9   216.880005  215.990005           0
10  215.229996  214.240005           0
11  215.679993  215.570007           0
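
The local_high flag marks where the peaks are but does not yet give the column asked for in the question. A minimal sketch of the remaining step, assuming the flags shown in the table above (hard-coded here so the example runs without scipy) and assuming a peak should only become visible one row after it occurs:

```python
import pandas as pd

# High values and local_high flags copied from the table above
df1 = pd.DataFrame({
    "High": [216.809998, 215.149994, 214.699997, 215.729996,
             213.690002, 214.880005, 214.589996, 216.029999,
             217.529999, 216.880005, 215.229996, 215.679993],
    "local_high": [0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0],
})

# Keep High only on flagged rows, delay by one row so a peak is visible
# only after it has been confirmed, then carry the last peak forward.
df1["Most_Rec_Max"] = (
    df1["High"]
    .where(df1["local_high"].eq(1))
    .shift(1)
    .ffill()
    .fillna(0)
)
```

Dropping the shift(1) call would instead place each peak in its own row, which uses the next day's high and so reintroduces the look-ahead the question wants to avoid.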
