
Groupby remainder of day to show True if identical consecutive values

Given the following DataFrame, how do I add a new column that shows True for the rest of the day once two consecutive "y" values have been seen in the val column within a single day (and False otherwise)?

  • Each day resets the logic.

This is close, but True should be set on every row of that day after the condition is seen.

Code

import pandas as pd

df_so = pd.DataFrame(
    {
        "val": list("yynnnyyynn")
    },
    index=pd.date_range(start="1/1/2018", periods=10, freq="6h"),
)

                   val
2018-01-01 00:00:00 y
2018-01-01 06:00:00 y
2018-01-01 12:00:00 n
2018-01-01 18:00:00 n
2018-01-02 00:00:00 n
2018-01-02 06:00:00 y
2018-01-02 12:00:00 y
2018-01-02 18:00:00 y
2018-01-03 00:00:00 n
2018-01-03 06:00:00 n

Desired output

                    val  out
2018-01-01 00:00:00  y   False
2018-01-01 06:00:00  y   False
2018-01-01 12:00:00  n   True
2018-01-01 18:00:00  n   True
2018-01-02 00:00:00  n   False
2018-01-02 06:00:00  y   False
2018-01-02 12:00:00  y   False
2018-01-02 18:00:00  y   True
2018-01-03 00:00:00  n   False
2018-01-03 06:00:00  n   False

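You can combine rolling and cummax within each day. A rolling sum over target rows equals target only when the last target values were all "y"; shift moves that flag to the following row, and cummax keeps it True for the rest of the day: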

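target = 2  # number of consecutive "y" values that triggers the flag
df_so['out'] = (df_so['val'].eq('y')                   # True where val is "y"
                    .groupby(df_so.index.normalize())  # one group per calendar day
                    # rolling sum == target: the last `target` rows were all "y";
                    # shift: flag starts on the next row; cummax: stays True afterwards
                    .transform(lambda x: x.rolling(target).sum().shift().eq(target).cummax())
               )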

Output:

                    val    out
2018-01-01 00:00:00   y  False
2018-01-01 06:00:00   y  False
2018-01-01 12:00:00   n   True
2018-01-01 18:00:00   n   True
2018-01-02 00:00:00   n  False
2018-01-02 06:00:00   y  False
2018-01-02 12:00:00   y  False
2018-01-02 18:00:00   y   True
2018-01-03 00:00:00   n  False
2018-01-03 06:00:00   n  False
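
For reuse, the same chain can be split into named steps. This is only a sketch of the approach above, not a different method: the helper name flag_after_run and its parameters are made up for illustration, and the input is cast to int before rolling as a precaution.

```python
import pandas as pd

df_so = pd.DataFrame(
    {"val": list("yynnnyyynn")},
    index=pd.date_range(start="1/1/2018", periods=10, freq="6h"),
)


def flag_after_run(is_y: pd.Series, target: int = 2) -> pd.Series:
    """Hypothetical helper: for one day's boolean Series, flag every row
    that comes after a run of `target` consecutive True values."""
    # Count of True values in the window of `target` rows ending at each row.
    run_sum = is_y.astype(int).rolling(target).sum()
    # True where the window ending on the *previous* row was all True.
    seen_on_previous_row = run_sum.shift().eq(target)
    # Once True, remain True for the rest of the group (the day).
    return seen_on_previous_row.cummax()


is_y = df_so["val"].eq("y")
days = df_so.index.normalize()  # midnight timestamp of each row's day

# The groupby makes the logic reset at the start of each day.
df_so["out"] = is_y.groupby(days).transform(flag_after_run)
print(df_so)
```

Because the function runs once per day, a run of "y" that spans midnight does not carry over into the next day.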
