Rolling window for local minima/maxima

Question

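I wrote a script (shown below) that uses historical stock data to identify local maxima. It uses the daily highs to mark local resistance levels. It works well, but for any given point in time (that is, any row in the stock data), I want to know the most recent resistance level before that point. I want this in its own column in the dataset. For example: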

The top grey line is the daily high and the bottom grey line is the daily close. Roughly, that part of the dataset looks like this:

High            Close
216.8099976     216.3399963
215.1499939     213.2299957
214.6999969     213.1499939
215.7299957     215.2799988 <- First blue dot at high
213.6900024     213.3699951
214.8800049     213.4100037 <- 2nd blue dot at high
214.5899963     213.4199982 
216.0299988     215.8200073
217.5299988     217.1799927 <- 3rd blue dot at high
216.8800049     215.9900055
215.2299957     214.2400055
215.6799927     215.5700073
....

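Currently, the script looks at the whole dataset at once to find the indices of the local maxima of the highs. Then, for any given point in the stock's history (that is, any given row), it looks up the NEXT maximum in the list of all maxima found. That would be one way to determine where the next resistance level is, but I don't want that because of look-ahead bias. I just want a column with the most recent past resistance level, or even the 2 most recent points in 2 columns. That would actually be ideal.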

So, for 1 column, my final output would look like this:

High            Close           Most_Rec_Max
216.8099976     216.3399963     0
215.1499939     213.2299957     0
214.6999969     213.1499939     0
215.7299957     215.2799988     0
213.6900024     213.3699951     215.7299957
214.8800049     213.4100037     215.7299957
214.5899963     213.4199982     214.8800049
216.0299988     215.8200073     214.8800049
217.5299988     217.1799927     214.8800049
216.8800049     215.9900055     217.5299988
215.2299957     214.2400055     217.5299988
215.6799927     215.5700073     217.5299988
....

Note that a point only shows up in the most-recent column after it has been discovered.

Here is the code I am using:

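import numpy as np
import matplotlib.pyplot as plt

real_close_prices = df['Close'].to_numpy()

highs = df['High'].to_numpy()
# x is not defined in the original snippet; row positions are assumed here
x = np.arange(len(highs))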

max_indexes = (np.diff(np.sign(np.diff(highs))) < 0).nonzero()[0] + 1 # local max
# +1 due to the fact that diff reduces the original index number

max_values_at_indexes = highs[max_indexes]
curr_high = [c for c in highs]
max_values_at_indexes.sort()
for m in max_values_at_indexes:
    for i, c in enumerate(highs):
        if m > c and curr_high[i] == c:
            curr_high[i] = m
#print(nextbig)
df['High_Resistance'] = curr_high

# plot
plt.figure(figsize=(12, 5))
plt.plot(x, highs, color='grey')
plt.plot(x, real_close_prices, color='grey')
plt.plot(x[max_indexes], highs[max_indexes], "o", label="max", color='b')
plt.show()

Hopefully someone can help me with this. Thanks!

Answer 1

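Here is one approach. Once you know where the peaks are, you can store the peak indices in p_ids and the peak values in p_vals. To assign the k-th most recent peak, note that p_vals[:-k] will appear at p_ids[k:]. The rest is forward filling.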
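
To make the index pairing concrete, here is a small sketch using the three peaks from the sample data (row positions 3, 5 and 8). The array literals are typed in by hand for illustration, and the values are rounded.

```python
import numpy as np

# Peak positions and (rounded) peak values from the sample data
p_ids = np.array([3, 5, 8])
p_vals = np.array([215.73, 214.88, 217.53])

k = 1
# At each peak from the k-th onward, the value of the peak k places earlier
pairs = list(zip(p_ids[k:].tolist(), p_vals[:-k].tolist()))
print(pairs)  # [(5, 215.73), (8, 214.88)]
```

So at row 5 the previous peak is 215.73 and at row 8 it is 214.88; forward filling carries those values to the rows in between.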

import numpy as np

# find all local maxima in the series by comparing to shifted values
peaks = (df.High > df.High.shift(1)) & (df.High > df.High.shift(-1))
# keep the High value where a peak occurs and NaN elsewhere,
# then forward fill with the previous peak value & set leading NaNs to 0
df['Most_Rec_Max'] = df.High.where(peaks).ffill().fillna(0)

# for finding n-most recent peak
p_ids, = np.where(peaks)            # row positions of the peaks
p_vals = df.High.to_numpy()[p_ids]  # peak values at those positions
for n in [1, 2]:
    col_name = f'{n+1}_Most_Rec_Max'
    df[col_name] = np.nan
    # at each peak, record the value of the peak n places earlier
    df.loc[df.index[p_ids[n:]], col_name] = p_vals[:-n]
    # assign the result back; inplace=True on a column selection
    # may not modify df in recent pandas versions
    df[col_name] = df[col_name].ffill().fillna(0)


#           High       Close  Most_Rec_Max  2_Most_Rec_Max  3_Most_Rec_Max
# 0   216.809998  216.339996      0.000000        0.000000        0.000000
# 1   215.149994  213.229996      0.000000        0.000000        0.000000
# 2   214.699997  213.149994      0.000000        0.000000        0.000000
# 3   215.729996  215.279999    215.729996        0.000000        0.000000
# 4   213.690002  213.369995    215.729996        0.000000        0.000000
# 5   214.880005  213.410004    214.880005      215.729996        0.000000
# 6   214.589996  213.419998    214.880005      215.729996        0.000000
# 7   216.029999  215.820007    214.880005      215.729996        0.000000
# 8   217.529999  217.179993    217.529999      214.880005      215.729996
# 9   216.880005  215.990006    217.529999      214.880005      215.729996
# 10  215.229996  214.240006    217.529999      214.880005      215.729996
# 11  215.679993  215.570007    217.529999      214.880005      215.729996
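
In the output above, a peak appears in Most_Rec_Max on the peak's own row, although the comparison with shift(-1) needs the next day's high to confirm it. The question asks for the value to appear only after the peak has been discovered. A minimal sketch of one way to get that, assuming a one-row delay is what is wanted, is to shift the peak values down by one row before forward filling. The DataFrame below is rebuilt by hand from the sample data in the question.

```python
import pandas as pd

highs = [216.8099976, 215.1499939, 214.6999969, 215.7299957,
         213.6900024, 214.8800049, 214.5899963, 216.0299988,
         217.5299988, 216.8800049, 215.2299957, 215.6799927]
df = pd.DataFrame({"High": highs})

# A row is a peak if its high exceeds both neighbours
peaks = (df.High > df.High.shift(1)) & (df.High > df.High.shift(-1))

# shift(1) moves each peak value to the following row, the first row
# on which the peak can be confirmed without looking ahead
df["Most_Rec_Max"] = df.High.where(peaks).shift(1).ffill().fillna(0)
print(df)
```

With the sample data this reproduces the Most_Rec_Max column from the question: 0 for rows 0 to 3, 215.7299957 for rows 4 and 5, 214.8800049 for rows 6 to 8, and 217.5299988 from row 9 onward.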

Answer 2

I just came across this function, which may help you a lot: scipy.signal.find_peaks.

Based on your example dataframe, we can do the following:

import pandas as pd
from scipy.signal import find_peaks

## Grab the minimum high value as a threshold.
min_high = df["High"].min()

### Run the High values through the function. The docs explain more,
### but we can set our height to the minimum high value.
### We just need one out of two return values.

peaks, _ = find_peaks(df["High"], height=min_high)

### Do some maintenance and add a column to mark peaks

# peaks_df is not defined in the original snippet; assumed here to be a
# one-column frame indexed by the peak positions
peaks_df = pd.DataFrame({"local_high": 1}, index=peaks)

# Merge on our index values (assumes df has the default 0..n-1 index)
df1 = df.merge(peaks_df, how="left", left_index=True, right_index=True)

# Set non-null values to 1 and null values to 0; Convert column to integer type.
df1.loc[~df1["local_high"].isna(), "local_high"] = 1
df1.loc[df1["local_high"].isna(), "local_high"] = 0
df1["local_high"] = df1["local_high"].astype(int)

Your dataframe should then look like this:

          High       Close  local_high
0   216.809998  216.339996           0
1   215.149994  213.229996           0
2   214.699997  213.149994           0
3   215.729996  215.279999           1
4   213.690002  213.369995           0
5   214.880005  213.410004           1
6   214.589996  213.419998           0
7   216.029999  215.820007           0
8   217.529999  217.179993           1
9   216.880005  215.990005           0
10  215.229996  214.240005           0
11  215.679993  215.570007           0
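
The local_high marker does not yet give the column the question asks for. As a rough sketch, assuming the same one-row confirmation delay described in the question, the marker can be turned into a most-recent-peak column by masking, shifting and forward filling. The DataFrame is rebuilt by hand from the sample data, and the Most_Rec_Max column name is taken from the question.

```python
import pandas as pd
from scipy.signal import find_peaks

highs = [216.8099976, 215.1499939, 214.6999969, 215.7299957,
         213.6900024, 214.8800049, 214.5899963, 216.0299988,
         217.5299988, 216.8800049, 215.2299957, 215.6799927]
df = pd.DataFrame({"High": highs})

# find_peaks returns the row positions of the local maxima
peaks, _ = find_peaks(df["High"].to_numpy())

# Mark the peak rows with 1 and all other rows with 0
df["local_high"] = 0
df.loc[df.index[peaks], "local_high"] = 1

# Keep High only on peak rows, delay by one row so that a peak is
# visible only after it is confirmed, then carry it forward
is_peak = df["local_high"].eq(1)
df["Most_Rec_Max"] = df["High"].where(is_peak).shift(1).ffill().fillna(0)
print(df)
```

Dropping the shift(1) call would instead show each peak on its own row, as in Answer 1.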

2 answers

Answer 1
1 accepted 2020-03-10 01:22:54

Answer 2
1 2020-03-10 02:14:03
