Python Pandas DataFrame: conditional column based on the values of other columns

Question

Description of the problem:
I am trying to simulate a machine whose operation mode "B" occurs if "VALUE" was greater than or equal to 5 in the 3 previous time steps, which means "VALUE" >= 5 for at least 3 minutes. Operation mode "B" stays "B" for the following time steps as long as "VALUE" is greater than or equal to 5, and it switches back to "A" only after at least 3 time steps, which means operation mode "B" remains valid for at least the next 3 minutes. After those 3 minutes, operation mode "A" is turned on if "VALUE" is less than 5.

The goal:
I need an approach using Python and pandas to identify the operation mode, "A" or "B" (column: "statusA/B"), based on the values in column "VALUE" and the status "on" or "down" (column: "VALUE<5 --> down, VALUE>=5 --> on").

The conditions that have to be considered are as follows:

Cases "A" and "B" depend on each other.
Case "B" occurs if at least 3 "on" steps occurred previously and the current VALUE is greater than or equal to 5.
Once "B" occurs, the next 3 time steps have to be "B" even if the status is "down", and it stays "B" as long as "on" persists.

What I tried:
I tried multiple approaches that apply counters to the "down" and "on" cases and track the status based on the counter values, but unfortunately none of them worked properly.

time	VALUE	VALUE<5 --> down / VALUE>=5 --> on	statusA/B
00:00	0	down	A
00:01	0	down	A
00:02	0	down	A
00:03	8	on	A
00:04	4	down	A
00:05	2	down	A
00:06	1	down	A
00:07	2	down	A
00:08	1	down	A
00:08	5	on	A
00:09	6	on	A
00:10	0	down	A
00:11	10	on	A
00:12	10	on	A
00:13	10	on	A
00:14	11	down	B
00:15	2	down	B
00:16	1	down	B
00:17	3	down	A
00:18	11	on	A
00:19	10	on	A
00:20	10	on	A
00:21	10	on	B
00:22	10	on	B
00:23	11	on	B
00:24	14	on	B
00:25	11	on	B

Answer 1

Modified solution. I edited my solution thanks to a subtle point made by dear mozway:

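import pandas as pd

# df2 is assumed to hold the question's data: a 'VALUE' column,
# with the on/down column at position 2.

# Step 1: mark 'B' where the three previous values are all >= 5;
# the other rows keep their number for now.
df2['status'] = df2['VALUE'].mask(df2['VALUE'].shift().rolling(3, min_periods=3).min() >= 5, 'B')

# Step 2: 'B' also holds for the two steps after each trigger;
# every remaining number becomes 'A'.
m1 = df2['status'].shift().eq('B')
m2 = df2['status'].shift(2).eq('B')

df2['status'] = (df2['status']
                .mask(m1 | m2).fillna('B')
                .astype(str)
                .str.replace(r'\d+', 'A', regex=True))  # regex=True is needed since pandas 2.0

# Step 3: an 'A' row that is "on" and has a 'B' one or two steps back becomes 'B'.
m5 = df2['status'].shift().eq('B')
m6 = df2['status'].shift(2).eq('B')
m3 = df2['status'].eq('A')
m4 = df2.iloc[:, 2].eq('on')

df2['status'] = df2['status'].mask((m5 & m3 & m4) | (m6 & m3 & m4)).fillna('B')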


    index  VALUE  ...                                 resulted statusA/B status
0       0      3  ...                                                  A      A
1       1      5  ...                                                  A      A
2       2      2  ...                                                  A      A
3       3      6  ...                                                  A      A
4       4      3  ...                                                  A      A
5       5      1  ...                                                  A      A
6       6      7  ...                                                  A      A
7       7      7  ...                                                  A      A
8       8      2  ...                                                  A      A
9       9      2  ...                                                  A      A
10     10      3  ...                                                  A      A
11     11      6  ...                                                  A      A
12     12      8  ...                                                  A      A
13     13      8  ...                                                  A      A
14     14      7  ...                                                  B      B
15     15      4  ...                                                  B      B
16     16      4  ...                                                  B      B
17     17      6  ...  A(expected is B because is "on" and at least 3...      B
18     18      6  ...  A(expected is B because is "on" and at least 3...      B
19     19      6  ...  A(expected is B because is "on" and at least 3...      B
20     20      7  ...                                                  B      B
21     21      2  ...                                                  B      B
22     22      9  ...                                                  B      B
23     23      8  ...  A(expected is B because "B" keeps a "B" for 3 ...      B
24     24      7  ...  A(expected is B because is "on" and at least 3...      B
25     25      2  ...                                                  B      B
26     26      4  ...  A(expected is B because "B" keeps a "B" for 3 ...      B
27     27      4  ...  A(expected is B because "B" keeps a "B" for 3 ...      B
28     28      1  ...  A(this true because it is down and the 3 time ...      A
29     29      4  ...                                                  A      A
[30 rows x 5 columns]
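
For comparison, the rules can also be written as an explicit loop, which may be easier to check by hand. This is only a sketch under one reading of the question: "B" is triggered when the three previous values are all >= 5, a trigger forces "B" for three steps (the trigger step plus two more), and after that "B" continues only while VALUE >= 5. The helper name label_modes and its parameters are hypothetical, and the VALUE list is copied from the output table above.

```python
import pandas as pd

# VALUE column copied from the output table above.
values = [3, 5, 2, 6, 3, 1, 7, 7, 2, 2, 3, 6, 8, 8, 7,
          4, 4, 6, 6, 6, 7, 2, 9, 8, 7, 2, 4, 4, 1, 4]


def label_modes(series, threshold=5, window=3, hold=3):
    """Label each step 'A' or 'B' with a small state machine (hypothetical helper)."""
    vals = series.tolist()
    status = []
    remaining = 0  # number of steps for which 'B' is still forced
    for i, v in enumerate(vals):
        # Trigger: the `window` values before step i are all >= threshold.
        triggered = i >= window and all(x >= threshold for x in vals[i - window:i])
        if triggered:
            remaining = hold
        if remaining > 0:
            # Inside the forced period: 'B' regardless of the current value.
            status.append('B')
            remaining -= 1
        elif status and status[-1] == 'B' and v >= threshold:
            # Forced period is over, but 'B' continues while the value stays "on".
            status.append('B')
        else:
            status.append('A')
    return pd.Series(status, index=series.index, name='status')


df = pd.DataFrame({'VALUE': values})
df['status'] = label_modes(df['VALUE'])
print(df)
```

On these 30 values the loop gives the same status column as the table above. It is not guaranteed to agree with the vectorized version on other data, because the last step of that version extends "B" only once rather than repeatedly.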

Answer 1
1 accepted 2022-11-13 19:40:53
