Rolling window for local minima/maxima

Question

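I wrote a script (shown below) that identifies local maxima in historical stock data. It uses the daily highs to mark local resistance levels. It works well, but what I want is, for any given point in time (i.e. any row in the stock data), the most recent resistance level before that point, stored in its own column of the dataset. For example:

The top grey line is the daily high and the bottom grey line is the daily close. Roughly, that part of the dataset looks like this: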

High            Close
216.8099976     216.3399963
215.1499939     213.2299957
214.6999969     213.1499939
215.7299957     215.2799988 <- First blue dot at high
213.6900024     213.3699951
214.8800049     213.4100037 <- 2nd blue dot at high
214.5899963     213.4199982 
216.0299988     215.8200073
217.5299988     217.1799927 <- 3rd blue dot at high
216.8800049     215.9900055
215.2299957     214.2400055
215.6799927     215.5700073
....

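Currently, the script looks at the whole dataset at once to find the indices of the local maxima of the highs. Then, for any given point in the stock's history (i.e. any given row), it looks up the NEXT maximum in the list of all maxima found. That would be one way to determine where the next resistance level is, but I don't want that because of look-ahead bias. I just want a column with the most recent past resistance level or, ideally, the two most recent ones in two columns.

So for one column, my final output would look like this: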

High            Close           Most_Rec_Max
216.8099976     216.3399963     0
215.1499939     213.2299957     0
214.6999969     213.1499939     0
215.7299957     215.2799988     0
213.6900024     213.3699951     215.7299957
214.8800049     213.4100037     215.7299957
214.5899963     213.4199982     214.8800049
216.0299988     215.8200073     214.8800049
217.5299988     217.1799927     214.8800049
216.8800049     215.9900055     217.5299988
215.2299957     214.2400055     217.5299988
215.6799927     215.5700073     217.5299988
....

Note that a peak only appears in the most-recent column after it has been discovered.

Here is the code I'm using:

real_close_prices = df['Close'].to_numpy()

highs = df['High'].to_numpy()

max_indexes = (np.diff(np.sign(np.diff(highs))) < 0).nonzero()[0] + 1 # local max
# +1 due to the fact that diff reduces the original index number

max_values_at_indexes = highs[max_indexes]
curr_high = [c for c in highs]
max_values_at_indexes.sort()
for m in max_values_at_indexes:
    for i, c in enumerate(highs):
        if m > c and curr_high[i] == c:
            curr_high[i] = m
#print(nextbig)
df['High_Resistance'] = curr_high

# plot
plt.figure(figsize=(12, 5))
plt.plot(x, highs, color='grey')
plt.plot(x, real_close_prices, color='grey')
plt.plot(x[max_indexes], highs[max_indexes], "o", label="max", color='b')
plt.show()

I hope someone can help me with this. Thanks!

Answer 1

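Here is one approach. Once you know where the peaks are, store the peak indices in p_ids and the peak values in p_vals. To assign the k-th most recent peak, note that p_vals[:-k] line up with p_ids[k:]. The rest is forward filling.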

import numpy as np

# find all local maxima in the series by comparing to shifted values
peaks = (df.High > df.High.shift(1)) & (df.High > df.High.shift(-1))
# keep the High at peak rows and NaN elsewhere, then forward fill with the
# previous peak value & handle leading NaNs with fillna
df['Most_Rec_Max'] = df.High.where(peaks).ffill().fillna(0)

# for finding n-most recent peak (assumes the default 0..N-1 integer index)
p_ids, = np.where(peaks)
p_vals = df.High.to_numpy()[p_ids]
for n in [1, 2]:
    col_name = f'{n+1}_Most_Rec_Max'
    df[col_name] = np.nan
    # the peak found at p_ids[i] is the (n+1)-th most recent one from p_ids[i+n] on
    df.loc[p_ids[n:], col_name] = p_vals[:-n]
    # assign the result back; chained inplace calls may not update df in newer pandas
    df[col_name] = df[col_name].ffill().fillna(0)


#           High       Close  Most_Rec_Max  2_Most_Rec_Max  3_Most_Rec_Max
# 0   216.809998  216.339996      0.000000        0.000000        0.000000
# 1   215.149994  213.229996      0.000000        0.000000        0.000000
# 2   214.699997  213.149994      0.000000        0.000000        0.000000
# 3   215.729996  215.279999    215.729996        0.000000        0.000000
# 4   213.690002  213.369995    215.729996        0.000000        0.000000
# 5   214.880005  213.410004    214.880005      215.729996        0.000000
# 6   214.589996  213.419998    214.880005      215.729996        0.000000
# 7   216.029999  215.820007    214.880005      215.729996        0.000000
# 8   217.529999  217.179993    217.529999      214.880005      215.729996
# 9   216.880005  215.990006    217.529999      214.880005      215.729996
# 10  215.229996  214.240006    217.529999      214.880005      215.729996
# 11  215.679993  215.570007    217.529999      214.880005      215.729996
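
In the output above, a peak already appears in its own row (row 3, for example), although it can only be recognised once the next day's high is known, and the question's expected output shows it one row later. A minimal sketch of a variant without that look-ahead, assuming the same strict-neighbour peak definition and a default integer index, is to delay each peak value by one row before forward filling. The column name 2nd_Most_Rec_Max and the inline sample frame are illustrative choices, not part of the answer.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "High": [216.8099976, 215.1499939, 214.6999969, 215.7299957,
             213.6900024, 214.8800049, 214.5899963, 216.0299988,
             217.5299988, 216.8800049, 215.2299957, 215.6799927],
    "Close": [216.3399963, 213.2299957, 213.1499939, 215.2799988,
              213.3699951, 213.4100037, 213.4199982, 215.8200073,
              217.1799927, 215.9900055, 214.2400055, 215.5700073],
})

high = df["High"]
# A row is a peak if it is strictly higher than both neighbours.
peaks = (high > high.shift(1)) & (high > high.shift(-1))

# Keep the High only at peak rows, then move it one row down so that a
# peak becomes visible on the first row where it could be confirmed.
df["Most_Rec_Max"] = high.where(peaks).shift(1).ffill().fillna(0)

# Second most recent peak: at each peak row take the previous peak's
# value, spread it back over the full index, then apply the same delay.
peak_vals = high[peaks]
df["2nd_Most_Rec_Max"] = (
    peak_vals.shift(1).reindex(df.index).shift(1).ffill().fillna(0)
)

print(df)
```

On the sample rows, Most_Rec_Max stays 0 through row 3 and switches to 215.7299957 at row 4, which matches the expected output in the question.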

Answer 2

I just came across a function that may help you a lot: scipy.signal.find_peaks.

Based on your sample dataframe, we can do the following:

import pandas as pd
from scipy.signal import find_peaks

## Grab the minimum high value as a threshold.
min_high = df["High"].min()

### Run the High values through the function. The docs explain more,
### but we can set our height to the minimum high value.
### We just need one out of two return values.

peaks, _ = find_peaks(df["High"], height=min_high)

### Do some maintenance and add a column to mark peaks

# Build a frame indexed by the peak positions (assumed definition of peaks_df;
# it relies on df having the default 0..N-1 integer index)
peaks_df = pd.DataFrame({"local_high": peaks}, index=peaks)

# Merge on our index values
df1 = df.merge(peaks_df, how="left", left_index=True, right_index=True)

# Set non-null values to 1 and null values to 0; Convert column to integer type.
df1.loc[~df1["local_high"].isna(), "local_high"] = 1
df1.loc[df1["local_high"].isna(), "local_high"] = 0
df1["local_high"] = df1["local_high"].astype(int)

Your dataframe should then look like this:

          High       Close  local_high
0   216.809998  216.339996           0
1   215.149994  213.229996           0
2   214.699997  213.149994           0
3   215.729996  215.279999           1
4   213.690002  213.369995           0
5   214.880005  213.410004           1
6   214.589996  213.419998           0
7   216.029999  215.820007           0
8   217.529999  217.179993           1
9   216.880005  215.990005           0
10  215.229996  214.240005           0
11  215.679993  215.570007           0
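
The snippet above only flags the peak rows. As a hedged sketch of how the same find_peaks result could be carried over to the most-recent-peak column the question asks for, the code below writes each peak value one row after the peak and forward fills it. It assumes a peak is confirmed as soon as the next row is known, and it builds the sample frame inline so that it runs on its own.

```python
import numpy as np
import pandas as pd
from scipy.signal import find_peaks

# Sample rows copied from the question.
df = pd.DataFrame({
    "High": [216.8099976, 215.1499939, 214.6999969, 215.7299957,
             213.6900024, 214.8800049, 214.5899963, 216.0299988,
             217.5299988, 216.8800049, 215.2299957, 215.6799927],
    "Close": [216.3399963, 213.2299957, 213.1499939, 215.2799988,
              213.3699951, 213.4100037, 213.4199982, 215.8200073,
              217.1799927, 215.9900055, 214.2400055, 215.5700073],
})

highs = df["High"].to_numpy()
# Positions of local maxima; the first and last rows are never reported
# as peaks, so peaks + 1 below is always a valid position.
peaks, _ = find_peaks(highs)

# Flag the peak rows without going through a merge.
df["local_high"] = 0
df.iloc[peaks, df.columns.get_loc("local_high")] = 1

# Write each peak value one row later, then carry it forward.
recent = np.full(len(df), np.nan)
recent[peaks + 1] = highs[peaks]
df["Most_Rec_Max"] = pd.Series(recent, index=df.index).ffill().fillna(0)

print(df)
```

find_peaks also accepts optional filters such as prominence and distance, which can be used to ignore minor wiggles when the goal is to keep only significant resistance levels.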

2 answers

Answer 1
1, accepted, 2020-03-10 01:22:54

Answer 2
1, 2020-03-10 02:14:03
