Function to modify row values using df.apply or similar in Pandas
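Overview
I am working with a dataframe in which df["Pivots"] alternates between 1 and -1 wherever a high or low has been identified by a zigzag indicator.
I am trying to use Pandas to do the following: when df["Pivots"] was previously (incorrectly) assigned a value of 1, marking a high, but another row actually has a higher High value, modify the rows involved.
See the screenshots below for a visual representation of the data and the desired output.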
Pseudocode
If the current row has -1 in df["Pivots"]:
    rows_between = rows with index < current row and index > the last pivot in df["Pivots"], which will be a 1
    If df.High in rows_between > df["Pivot Price"] of that last pivot row, then actual_high is df["High"].max() in rows_between.
        Remove the 1 from df["Pivots"] and the value from df["Pivot Price"] in that pivot row, and add them to df["Pivots"] and df["Pivot Price"] in the row of the actual high.
Example
In this example, df.High in the 2023-10-08 row is the actual high, and it is higher than df["Pivot Price"] in the 2023-09-24 row.
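This is the original dataframe:
[screenshot: original dataframe]
This is the desired output:
[screenshot: desired output]
The actual dataframe will contain many more rows; this is just a minimal, reproducible example.
Code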
df.to_dict()
{'Open': {Timestamp('2023-09-24 00:00:00', freq='W-SUN'): 1.0427,
Timestamp('2023-10-01 00:00:00', freq='W-SUN'): 1.0586,
Timestamp('2023-10-08 00:00:00', freq='W-SUN'): 1.0314,
Timestamp('2023-10-15 00:00:00', freq='W-SUN'): 1.0669,
Timestamp('2023-10-22 00:00:00', freq='W-SUN'): 1.0058,
Timestamp('2023-10-29 00:00:00', freq='W-SUN'): 0.9966},
'High': {Timestamp('2023-09-24 00:00:00', freq='W-SUN'): 1.0621,
Timestamp('2023-10-01 00:00:00', freq='W-SUN'): 1.0609,
Timestamp('2023-10-08 00:00:00', freq='W-SUN'): 1.0714,
Timestamp('2023-10-15 00:00:00', freq='W-SUN'): 1.0679,
Timestamp('2023-10-22 00:00:00', freq='W-SUN'): 1.0198,
Timestamp('2023-10-29 00:00:00', freq='W-SUN'): 0.9966},
'Low': {Timestamp('2023-09-24 00:00:00', freq='W-SUN'): 1.0383,
Timestamp('2023-10-01 00:00:00', freq='W-SUN'): 1.0297,
Timestamp('2023-10-08 00:00:00', freq='W-SUN'): 1.0285,
Timestamp('2023-10-15 00:00:00', freq='W-SUN'): 1.004,
Timestamp('2023-10-22 00:00:00', freq='W-SUN'): 0.9941,
Timestamp('2023-10-29 00:00:00', freq='W-SUN'): 0.938},
'Close': {Timestamp('2023-09-24 00:00:00', freq='W-SUN'): 1.0577,
Timestamp('2023-10-01 00:00:00', freq='W-SUN'): 1.0297,
Timestamp('2023-10-08 00:00:00', freq='W-SUN'): 1.0666,
Timestamp('2023-10-15 00:00:00', freq='W-SUN'): 1.0053,
Timestamp('2023-10-22 00:00:00', freq='W-SUN'): 0.9988,
Timestamp('2023-10-29 00:00:00', freq='W-SUN'): 0.9528},
'Pivots': {Timestamp('2023-09-24 00:00:00', freq='W-SUN'): 1,
Timestamp('2023-10-01 00:00:00', freq='W-SUN'): 0,
Timestamp('2023-10-08 00:00:00', freq='W-SUN'): 0,
Timestamp('2023-10-15 00:00:00', freq='W-SUN'): 0,
Timestamp('2023-10-22 00:00:00', freq='W-SUN'): 0,
Timestamp('2023-10-29 00:00:00', freq='W-SUN'): -1},
'Pivot Price': {Timestamp('2023-09-24 00:00:00', freq='W-SUN'): 1.0621,
Timestamp('2023-10-01 00:00:00', freq='W-SUN'): nan,
Timestamp('2023-10-08 00:00:00', freq='W-SUN'): nan,
Timestamp('2023-10-15 00:00:00', freq='W-SUN'): nan,
Timestamp('2023-10-22 00:00:00', freq='W-SUN'): nan,
Timestamp('2023-10-29 00:00:00', freq='W-SUN'): 0.938},
'Date': {Timestamp('2023-09-24 00:00:00', freq='W-SUN'): Timestamp('2023-09-24 00:00:00'),
Timestamp('2023-10-01 00:00:00', freq='W-SUN'): Timestamp('2023-10-01 00:00:00'),
Timestamp('2023-10-08 00:00:00', freq='W-SUN'): Timestamp('2023-10-08 00:00:00'),
Timestamp('2023-10-15 00:00:00', freq='W-SUN'): Timestamp('2023-10-15 00:00:00'),
Timestamp('2023-10-22 00:00:00', freq='W-SUN'): Timestamp('2023-10-22 00:00:00'),
Timestamp('2023-10-29 00:00:00', freq='W-SUN'): Timestamp('2023-10-29 00:00:00')}}
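The dictionary above does not paste back into a session as-is: it needs Timestamp imported, and, as far as I know, pandas 2.x no longer accepts the freq argument on Timestamp. A minimal sketch that rebuilds the same sample frame, assuming a weekly Sunday index and treating the Date column as a plain copy of the index:

```python
import numpy as np
import pandas as pd

# Weekly index on Sundays, matching the W-SUN timestamps in the dict above.
index = pd.date_range("2023-09-24", periods=6, freq="W-SUN")

df = pd.DataFrame(
    {
        "Open": [1.0427, 1.0586, 1.0314, 1.0669, 1.0058, 0.9966],
        "High": [1.0621, 1.0609, 1.0714, 1.0679, 1.0198, 0.9966],
        "Low": [1.0383, 1.0297, 1.0285, 1.004, 0.9941, 0.938],
        "Close": [1.0577, 1.0297, 1.0666, 1.0053, 0.9988, 0.9528],
        "Pivots": [1, 0, 0, 0, 0, -1],
        "Pivot Price": [1.0621, np.nan, np.nan, np.nan, np.nan, 0.938],
    },
    index=index,
)
# The original frame also carries the date as a regular column.
df["Date"] = df.index
```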
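For reference, this is the code that generates these pivots.
I can't think of a short solution using .apply(), but with a few helper functions you can solve the problem with the following code: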
import numpy as np

def get_highs_idx(df):
    # Index labels of all rows currently marked as a high.
    return df[df['Pivots'] == 1].index.tolist()

def get_lows_idx(df):
    # Index labels of all rows currently marked as a low.
    return df[df['Pivots'] == -1].index.tolist()

def get_previous_high_idx(df, low_idx):
    # Latest high that comes before the given low, or None if there is none.
    highs_idx = get_highs_idx(df)
    for high_idx in reversed(highs_idx):
        if high_idx < low_idx:
            return high_idx
    return None

def reset_pivot(df, old_high_idx, new_high_idx):
    # Clear the old high and mark the new one, using its High as the pivot price.
    df.loc[old_high_idx, 'Pivots'] = 0
    df.loc[old_high_idx, 'Pivot Price'] = np.nan
    df.loc[new_high_idx, 'Pivots'] = 1
    df.loc[new_high_idx, 'Pivot Price'] = df.loc[new_high_idx, 'High']

def correct_highs(df):
    # For each low, move the preceding high to the row with the largest High
    # between that high and the low (label slicing is inclusive on both ends).
    lows_idx = get_lows_idx(df)
    for low_idx in lows_idx:
        high_idx = get_previous_high_idx(df, low_idx)
        if high_idx is not None:
            new_high_idx = df.loc[high_idx:low_idx, 'High'].idxmax()
            if high_idx != new_high_idx:
                reset_pivot(df, high_idx, new_high_idx)

correct_highs(df)
The code could be shortened a bit, but I think it is clearer this way.
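As an illustration of what a shorter form might look like, here is a sketch that folds the helpers into a single function. The name correct_highs_compact is hypothetical, and the sample frame contains only the columns the function touches; it is meant to behave like correct_highs above, not to replace it.

```python
import numpy as np
import pandas as pd


def correct_highs_compact(df):
    """Move each high pivot to the row with the largest High before the next low."""
    for low_idx in df.loc[df["Pivots"] == -1].index:
        # Highs strictly before this low; recomputed because df changes in the loop.
        mask = (df["Pivots"] == 1) & (df.index < low_idx)
        earlier_highs = df.loc[mask].index
        if len(earlier_highs) == 0:
            continue
        high_idx = earlier_highs[-1]
        # Label slicing is inclusive, so both pivot rows are part of the window.
        new_high_idx = df.loc[high_idx:low_idx, "High"].idxmax()
        if new_high_idx != high_idx:
            df.loc[high_idx, "Pivots"] = 0
            df.loc[high_idx, "Pivot Price"] = np.nan
            df.loc[new_high_idx, "Pivots"] = 1
            df.loc[new_high_idx, "Pivot Price"] = df.loc[new_high_idx, "High"]


# Sample data from the question, reduced to the relevant columns.
df = pd.DataFrame(
    {
        "High": [1.0621, 1.0609, 1.0714, 1.0679, 1.0198, 0.9966],
        "Pivots": [1, 0, 0, 0, 0, -1],
        "Pivot Price": [1.0621, np.nan, np.nan, np.nan, np.nan, 0.938],
    },
    index=pd.date_range("2023-09-24", periods=6, freq="W-SUN"),
)
correct_highs_compact(df)
print(df)
```

On the sample data, the 1 in df["Pivots"] should end up on the 2023-10-08 row with a pivot price of 1.0714, and the 2023-09-24 row should be reset to 0 and NaN.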
Following your comment, I have also added code below to correct the lows.
# Relies on np, get_highs_idx and get_lows_idx defined above.
def get_previous_low_idx(df, high_idx):
    # Latest low that comes before the given high, or None if there is none.
    lows_idx = get_lows_idx(df)
    for low_idx in reversed(lows_idx):
        if low_idx < high_idx:
            return low_idx
    return None

def reset_low_pivot(df, old_low_idx, new_low_idx):
    # Clear the old low and mark the new one, using its Low as the pivot price.
    df.loc[old_low_idx, 'Pivots'] = 0
    df.loc[old_low_idx, 'Pivot Price'] = np.nan
    df.loc[new_low_idx, 'Pivots'] = -1
    df.loc[new_low_idx, 'Pivot Price'] = df.loc[new_low_idx, 'Low']

def correct_lows(df):
    # For each high, move the preceding low to the row with the smallest Low
    # between that low and the high.
    highs_idx = get_highs_idx(df)
    for high_idx in highs_idx:
        low_idx = get_previous_low_idx(df, high_idx)
        if low_idx is not None:
            new_low_idx = df.loc[low_idx:high_idx, 'Low'].idxmin()
            if low_idx != new_low_idx:
                reset_low_pivot(df, low_idx, new_low_idx)

correct_lows(df)
I didn't want to alter the original answer, but you may want to rename reset_pivot to reset_high_pivot for consistency.
You could also add a higher-level function:
def correct_pivots(df):
    correct_highs(df)
    correct_lows(df)
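Since df["Pivots"] is expected to alternate between 1 and -1, a quick sanity check after running the corrections may be useful. The helper below, pivots_alternate, is a hypothetical addition and not part of the answer above; it only assumes a Pivots column holding 1, -1 and 0.

```python
import pandas as pd


def pivots_alternate(df):
    """Return True if the non-zero values in df['Pivots'] alternate in sign."""
    pivots = df.loc[df["Pivots"] != 0, "Pivots"].to_numpy()
    # Alternation means every pivot is the negative of the one before it.
    return bool((pivots[1:] == -pivots[:-1]).all())


good = pd.DataFrame({"Pivots": [1, 0, -1, 0, 0, 1, -1]})
bad = pd.DataFrame({"Pivots": [1, 0, 1, -1]})
print(pivots_alternate(good))  # True
print(pivots_alternate(bad))   # False
```

A frame with zero or one pivot counts as alternating here, which seems like the least surprising choice for very short inputs.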