Groupby remainder of day to show True if identical consecutive values
Given the following DataFrame, how do I add a new column that shows True for the rest of the day once two consecutive "y" values have been seen within a single day in the val column (and False otherwise)?
This is close, but True should be set for every row in that day after the condition is seen.
Code
import pandas as pd

# Ten rows at 6-hour intervals, so each calendar day holds up to four rows
df_so = pd.DataFrame(
    {
        "val": list("yynnnyyynn")
    },
    index=pd.date_range(start="1/1/2018", periods=10, freq="6h"),
)
val
2018-01-01 00:00:00 y
2018-01-01 06:00:00 y
2018-01-01 12:00:00 n
2018-01-01 18:00:00 n
2018-01-02 00:00:00 n
2018-01-02 06:00:00 y
2018-01-02 12:00:00 y
2018-01-02 18:00:00 y
2018-01-03 00:00:00 n
2018-01-03 06:00:00 n
Desired output
val out
2018-01-01 00:00:00 y False
2018-01-01 06:00:00 y False
2018-01-01 12:00:00 n True
2018-01-01 18:00:00 n True
2018-01-02 00:00:00 n False
2018-01-02 06:00:00 y False
2018-01-02 12:00:00 y False
2018-01-02 18:00:00 y True
2018-01-03 00:00:00 n False
2018-01-03 06:00:00 n False
You can use cummax to check whether the condition held at some earlier point in the same day:
target = 2
df_so['out'] = (df_so['val'].eq('y')
.groupby(df_so.index.normalize())
.transform(lambda x: x.rolling(target).sum().shift().eq(target).cummax())
)
Output:
val out
2018-01-01 00:00:00 y False
2018-01-01 06:00:00 y False
2018-01-01 12:00:00 n True
2018-01-01 18:00:00 n True
2018-01-02 00:00:00 n False
2018-01-02 06:00:00 y False
2018-01-02 12:00:00 y False
2018-01-02 18:00:00 y True
2018-01-03 00:00:00 n False
2018-01-03 06:00:00 n False
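For readers who find the one-line lambda dense, the same chain can be unpacked into named steps. This is only a sketch of the approach above, not a different method: the helper name flag_rest_of_day and the intermediate variable names are made up for illustration, and the explicit astype(int) cast is an added assumption to keep the rolling sum clearly numeric.

```python
import pandas as pd

df_so = pd.DataFrame(
    {"val": list("yynnnyyynn")},
    index=pd.date_range(start="1/1/2018", periods=10, freq="6h"),
)

target = 2  # number of consecutive "y" values required


def flag_rest_of_day(is_y: pd.Series) -> pd.Series:
    """Hypothetical helper: flag rows after `target` consecutive True values.

    Receives the boolean values of a single day (one group).
    """
    # Count the True values in each window of `target` consecutive rows.
    window_count = is_y.astype(int).rolling(target).sum()
    # Shift by one row so the flag starts on the row after the run completes.
    run_completed_before = window_count.shift().eq(target)
    # cummax keeps the flag True for every later row of the same day.
    return run_completed_before.cummax()


is_y = df_so["val"].eq("y")
# normalize() strips the time of day, so rows are grouped by calendar date.
df_so["out"] = is_y.groupby(df_so.index.normalize()).transform(flag_rest_of_day)
print(df_so)
```

Because the grouping restarts the rolling window and the cumulative maximum for each date, a run of "y" values that straddles midnight does not count, and the flag resets to False at the start of every day.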