
Pandas DataFrame: create a new column based on logic dependent on other columns, with a cumulative counting rule

I have a DataFrame that originally looks as follows:

d1={'on':[0,1,0,1,0,0,0,1,0,0,0],'off':[0,0,0,0,0,0,1,0,1,0,1]}

[Image: original DataFrame]

My end goal is to add a new column 'final' that shows a value of 1 once an 'on' indicator is triggered (ignoring any duplicates), and that is switched back to 0 when the 'off' indicator is triggered, but only if 'final' has already been on for 3 rows. I tried to come up with some code but could not make any headway.

My desired output is as follows:

[Image: desired output]

Column 'final' is first triggered in row 1, when the 'on' indicator switches to 1. The 'on' indicator in row 3 is ignored because it is just a redundant signal. The 'off' indicator in row 6 is triggered and 'final' is switched back to 0 because it has already been on for more than 3 rows. Row 8 is different: the 'off' indicator is triggered there, but 'final' cannot be switched off until another 'off' indicator arrives in row 10, because only by then has 'final' been switched on for more than 3 rows.

Thank you for your help. Much appreciated.
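
The rule described in the question can be traced by hand. The sketch below hard-codes the expected 'final' column as described above and counts, for each row, how long the current run of 1s in 'final' has lasted. The names df_expected, run_id and rows_on are introduced here purely for illustration.

```python
import pandas as pd

d1 = {'on':  [0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0],
      'off': [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]}
df_expected = pd.DataFrame(d1)

# Expected values, written out by hand from the description in the question.
df_expected['final'] = [0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0]

# Every 0 in 'final' starts a new group, so a cumulative sum within each
# group gives the length of the current run of 1s.
run_id = (df_expected['final'] == 0).cumsum()
df_expected['rows_on'] = df_expected.groupby(run_id)['final'].cumsum()

print(df_expected)
```

In this trace rows_on is only 2 at row 8, so the 'off' signal there is ignored, whereas the runs preceding rows 6 and 10 have already lasted 5 and 3 rows, so 'off' takes effect in those rows.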

One solution, using a "state machine" implemented with yield:

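import pandas as pd

d1 = {'on': [0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0], 'off': [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]}


def state_machine():
    # The first send() delivers the first row's (on, off) pair.
    on, off = yield
    cnt, current = 0, on
    while True:
        # Turn on if 'on' fires; a redundant 'on' leaves the state unchanged.
        current = int(on or current)
        # Count the rows spent in the on state, including this one.
        cnt += current

        # 'off' only takes effect after more than 3 rows in the on state.
        if off and cnt > 3:
            cnt = 0
            current = 0

        on, off = yield current


machine = state_machine()
next(machine)  # advance the generator to its first yield

df = pd.DataFrame(d1)
df['final'] = df.apply(lambda x: machine.send((x['on'], x['off'])), axis=1)

print(df)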

Prints:

    on  off  final
0    0    0      0
1    1    0      1
2    0    0      1
3    1    0      1
4    0    0      1
5    0    0      1
6    0    1      0
7    1    0      1
8    0    1      1
9    0    0      1
10   0    1      0
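
A plain-function variant of the same rule may be easier to re-run, since the generator above keeps its state between calls. The following is only a sketch: compute_final and its min_rows parameter are names introduced here for illustration, not part of the original answer.

```python
import pandas as pd


def compute_final(on, off, min_rows=3):
    """Return the 'final' flags for paired on/off signals.

    min_rows is a hypothetical parameter: an 'off' signal only takes
    effect once 'final' has been on for more than min_rows rows.
    """
    result = []
    current, cnt = 0, 0
    for on_flag, off_flag in zip(on, off):
        current = int(on_flag or current)  # a redundant 'on' changes nothing
        cnt += current                     # rows spent on, including this one
        if off_flag and cnt > min_rows:
            current, cnt = 0, 0            # switch off and restart the count
        result.append(current)
    return result


d1 = {'on':  [0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0],
      'off': [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]}
df = pd.DataFrame(d1)
df['final'] = compute_final(df['on'], df['off'])
print(df['final'].tolist())
```

Because the state lives in local variables, each call starts from a clean slate and the result does not depend on how many times the function is invoked.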
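
Another solution, iterating over the rows with iterrows:

import pandas as pd

d1 = {'on': [0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0], 'off': [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]}
df = pd.DataFrame(d1)
df['final'] = 0
status, hook = 0, 0  # current state; row index at which 'final' was last switched on

for index, row in df.iterrows():
    # Record the switch-on row only while off, so a redundant 'on' is ignored.
    if row['on'] and not status:
        hook = index
    # Stay on (or turn on) unless 'off' fires after more than 3 rows in the on state.
    status = int((row['on'] or status) and not (row['off'] and index - hook > 2))
    # Write through df.at: assigning to `row` may only modify a copy.
    df.at[index, 'final'] = status
print(df)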

Output:

    on  off  final
0    0    0      0
1    1    0      1
2    0    0      1
3    1    0      1
4    0    0      1
5    0    0      1
6    0    1      0
7    1    0      1
8    0    1      1
9    0    0      1
10   0    1      0
