Condition check based on Pandas rolling window
Based on the reproducible code below, I have the following dataframe:
df.tail(10)
open high low close 200MA 20MA Trend
date
2020-12-24 1.3273 1.3384 1.3257 1.3384 1.324826 1.325365 Up
2020-12-25 1.3408 1.3408 1.3268 1.3268 1.324926 1.326240 Up
2020-12-26 1.3268 1.3283 1.3217 1.3239 1.325008 1.326085 Up
2020-12-27 1.3240 1.3240 1.3078 1.3078 1.325009 1.325215 Up
2020-12-28 1.3103 1.3103 1.2878 1.2878 1.324973 1.323490 Down
2020-12-29 1.2893 1.2932 1.2876 1.2886 1.324951 1.321890 Down
2020-12-30 1.2871 1.2937 1.2810 1.2906 1.324923 1.319755 Down
2020-12-31 1.2905 1.3020 1.2905 1.2993 1.324934 1.318450 Down
2021-01-01 1.3006 1.3022 1.2896 1.2905 1.324893 1.316830 Down
2021-01-02 1.2909 1.3085 1.2890 1.3008 1.324908 1.315660 Down
I want a new column based on the following conditions:
Desired output:
I have tried using the rolling window functionality in Pandas but struggled to get the required results.
Code to reproduce the example:
import pandas as pd
import numpy as np
def genMockDataFrame(days,startPrice,colName,startDate,seed=None):
    periods = days*24
    np.random.seed(seed)
    steps = np.random.normal(loc=0, scale=0.0018, size=periods)
    steps[0]=0
    P = startPrice+np.cumsum(steps)
    P = [round(i,4) for i in P]
    fxDF = pd.DataFrame({
        'ticker':np.repeat( [colName], periods ),
        'date':np.tile( pd.date_range(startDate, periods=periods, freq='H'), 1 ),
        'price':(P)})
    fxDF.index = pd.to_datetime(fxDF.date)
    fxDF = fxDF.price.resample('D').ohlc()
    return fxDF
df = genMockDataFrame(290,1.1904,'eurusd','19/3/2020',seed=1)
df["200MA"] = df["close"].rolling(window=200).mean()
df["20MA"] = df["close"].rolling(window=20).mean()
df.loc[df['20MA'] > df['200MA'], "Trend"] = "Up"
df.loc[df['20MA'] < df['200MA'], "Trend"] = "Down"
You can build two boolean masks by comparing df['20MA'] with df['200MA'] using .gt() and .lt(). Then apply .rolling() with .sum() to count how many rows in each rolling window satisfy each condition, and use .ge(1) to check that the count is at least 1. A window that contains at least one "Up" row and at least one "Down" row passes both checks. Finally, use .loc[] with the combined mask to assign 'Neutral' to the new column for the matching rows, as follows:
df['Trend 20 Window'] = '' # init to ''
periods = 20
mask = df['20MA'].gt(df['200MA']).rolling(periods).sum().ge(1) & df['20MA'].lt(df['200MA']).rolling(periods).sum().ge(1)
df.loc[mask, 'Trend 20 Window'] = 'Neutral'
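To see why the mask picks out the rows just after a crossover, it can help to look at the intermediate steps on a tiny frame. The sketch below is illustrative only: the `toy` frame, its values, the window of 3, and the column name 'Trend 3 Window' are all assumptions chosen so the result is easy to check by hand.

```python
import pandas as pd

# Made-up values: the fast average sits above the slow one for three rows,
# then below it for the remaining four.
toy = pd.DataFrame({
    "20MA":  [2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0],
    "200MA": [1.0] * 7,
})
periods = 3  # assumed small window for readability

# True where a single row is "Up" / "Down" on its own
is_up = toy["20MA"].gt(toy["200MA"])
is_down = toy["20MA"].lt(toy["200MA"])

# Count Up / Down rows inside each trailing window; the first
# periods - 1 rows have an incomplete window and yield NaN.
up_count = is_up.rolling(periods).sum()
down_count = is_down.rolling(periods).sum()

# A window is mixed when it holds at least one Up row and one Down row.
# NaN counts compare as False, so incomplete windows are never flagged.
mask = up_count.ge(1) & down_count.ge(1)

toy["Trend 3 Window"] = ""
toy.loc[mask, "Trend 3 Window"] = "Neutral"
print(toy.assign(up_count=up_count, down_count=down_count))
```

Here only the rows at positions 3 and 4 are flagged: their trailing three-row windows still include an "Up" row from before the crossover, while later windows contain "Down" rows only.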
Let's test it with your sample data (10 rows) with a smaller rolling window of 3:
df['Trend 20 Window'] = ''
periods = 3
mask = df['20MA'].gt(df['200MA']).rolling(periods).sum().ge(1) & df['20MA'].lt(df['200MA']).rolling(periods).sum().ge(1)
df.loc[mask, 'Trend 20 Window'] = 'Neutral'
Result:
open high low close 200MA 20MA Trend Trend 20 Window
date
2020-12-24 1.3273 1.3384 1.3257 1.3384 1.324826 1.325365 Up
2020-12-25 1.3408 1.3408 1.3268 1.3268 1.324926 1.326240 Up
2020-12-26 1.3268 1.3283 1.3217 1.3239 1.325008 1.326085 Up
2020-12-27 1.3240 1.3240 1.3078 1.3078 1.325009 1.325215 Up
2020-12-28 1.3103 1.3103 1.2878 1.2878 1.324973 1.323490 Down Neutral
2020-12-29 1.2893 1.2932 1.2876 1.2886 1.324951 1.321890 Down Neutral
2020-12-30 1.2871 1.2937 1.2810 1.2906 1.324923 1.319755 Down
2020-12-31 1.2905 1.3020 1.2905 1.2993 1.324934 1.318450 Down
2021-01-01 1.3006 1.3022 1.2896 1.2905 1.324893 1.316830 Down
2021-01-02 1.2909 1.3085 1.2890 1.3008 1.324908 1.315660 Down
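The result above can be reproduced without the mock-data generator by typing in the two moving-average columns from the ten rows printed in the question. In this sketch the helper name `flag_mixed_windows` and its parameters are hypothetical, introduced only to package the same mask logic; it is not part of pandas.

```python
import pandas as pd

def flag_mixed_windows(fast, slow, periods, label="Neutral"):
    """Return a Series holding `label` where the trailing window of
    `periods` rows contains both fast > slow and fast < slow rows."""
    seen_up = fast.gt(slow).rolling(periods).sum().ge(1)
    seen_down = fast.lt(slow).rolling(periods).sum().ge(1)
    out = pd.Series("", index=fast.index)
    out.loc[seen_up & seen_down] = label
    return out

# 200MA and 20MA values copied from the ten rows shown in the question
sample = pd.DataFrame(
    {
        "200MA": [1.324826, 1.324926, 1.325008, 1.325009, 1.324973,
                  1.324951, 1.324923, 1.324934, 1.324893, 1.324908],
        "20MA": [1.325365, 1.326240, 1.326085, 1.325215, 1.323490,
                 1.321890, 1.319755, 1.318450, 1.316830, 1.315660],
    },
    index=pd.date_range("2020-12-24", periods=10, freq="D", name="date"),
)

sample["Trend 20 Window"] = flag_mixed_windows(
    sample["20MA"], sample["200MA"], periods=3
)
print(sample)
```

With a window of 3, only 2020-12-28 and 2020-12-29 are marked 'Neutral', matching the table above. Passing periods=20 applies the same check over the 20-row window from the original question.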