How to flag rows where the previous three values are the same in a DataFrame column?
My DataFrame has the following format:
import pandas as pd

final_df = pd.DataFrame([
    {'Open': 100, 'Close': 101, 'Candle': 1},
    {'Open': 100, 'Close': 101, 'Candle': 1},
    {'Open': 101, 'Close': 102, 'Candle': 1},
    {'Open': 102, 'Close': 101, 'Candle': 0},
    {'Open': 101, 'Close': 100, 'Candle': 0},
    {'Open': 100, 'Close': 99, 'Candle': 0},
    {'Open': 99, 'Close': 98, 'Candle': 0},
    {'Open': 98, 'Close': 99, 'Candle': 1},
    {'Open': 99, 'Close': 100, 'Candle': 1},
    {'Open': 100, 'Close': 101, 'Candle': 1},
    {'Open': 100, 'Close': 99, 'Candle': 0},
])
I would like to create a column called pattern which has the value 1 every time three or more consecutive values of the column 'Candle' are the same. I tried the following code, but it flags every row of a run in which three or more consecutive values are the same. Instead, I want to flag only the rows at which the pattern is complete, i.e. have 1 for the rows holding the third, fourth, ... candle of a run:
final_df['pattern'] = final_df.Candle.groupby([final_df.Candle.diff().ne(0).cumsum()]).transform('size').ge(3).astype(int)
In this example, I want the rows with index 2, 5, 6, and 9 to be flagged, as these are the data points where the three most recent values (the current row and the two before it) are the same.
You can roll a window of three rows over the column and check whether all values in the window are the same by testing that the length of Series.value_counts() is 1. The first two rows have no full window, so they come out as NaN and are filled with 0:
final_df['pattern'] = (final_df['Candle'].rolling(3).apply(lambda s: (len(s.value_counts()) == 1))
                       .fillna(0)
                       .astype(int))
print(final_df)
    Open  Close  Candle  pattern
0    100    101       1        0
1    100    101       1        0
2    101    102       1        1
3    102    101       0        0
4    101    100       0        0
5    100     99       0        1
6     99     98       0        1
7     98     99       1        0
8     99    100       1        0
9    100    101       1        1
10   100     99       0        0
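As a possible alternative to the rolling window, the same flags can be obtained without a Python-level lambda by numbering each row within its run of identical values and flagging positions from the third onward. This is only a sketch and not part of the original answer: the names `candle`, `run_id`, and `pattern` are illustrative, and the series simply repeats the 'Candle' values from the question.

```python
import pandas as pd

# 'Candle' values copied from the question's example data.
candle = pd.Series([1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0], name="Candle")

# A new run starts whenever the value differs from the previous row;
# the cumulative sum of those change points gives each run its own id.
run_id = candle.ne(candle.shift()).cumsum()

# cumcount() is the 0-based position of a row inside its run, so
# positions >= 2 are the third, fourth, ... rows of that run.
pattern = candle.groupby(run_id).cumcount().ge(2).astype(int)

print(pattern.tolist())
# [0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0]
```

Assigning the result with final_df['pattern'] = pattern would work as long as final_df keeps its default 0..10 index. Unlike the rolling version, this approach only needs a different threshold in ge() if the required run length changes.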