Filtering rows based on a series of values in pandas

Question

I've been struggling to filter out values in a column. I have a dataframe (data) that looks like the one below.

 Index                                                            Value
2019-11-22 00:00:00                                                0.0  
2019-11-22 00:05:00                                                1.0  
2019-11-22 00:10:00                                                2.0  
2019-11-22 00:15:00                                                3.0  
2019-11-22 00:20:00                                                4.0  
2019-11-22 00:25:00                                                5.0  
2019-11-22 00:30:00                                                6.0  
2019-11-22 00:35:00                                                7.0  
2019-11-22 00:40:00                                                8.0  
2019-11-22 00:45:00                                                0.0  
2019-11-22 00:50:00                                                0.0  
2019-11-22 00:55:00                                                1.0  
2019-11-22 01:00:00                                                2.0  
2019-11-22 01:05:00                                                3.0  
2019-11-22 01:10:00                                                4.0  
2019-11-22 01:15:00                                                5.0

I want to keep each series of values that goes above 5 and set all other values to zero. For example, if a series only runs from 1 to 5, every value in it should be set to zero, whereas if there are eight rows with values from 1 to 8, the code should keep them as they are. The final output should be the following.

 Index                                                            Value
2019-11-22 00:00:00                                                0.0  
2019-11-22 00:05:00                                                1.0  
2019-11-22 00:10:00                                                2.0  
2019-11-22 00:15:00                                                3.0  
2019-11-22 00:20:00                                                4.0  
2019-11-22 00:25:00                                                5.0  
2019-11-22 00:30:00                                                6.0  
2019-11-22 00:35:00                                                7.0  
2019-11-22 00:40:00                                                8.0  
2019-11-22 00:45:00                                                0.0  
2019-11-22 00:50:00                                                0.0  
2019-11-22 00:55:00                                                0.0  
2019-11-22 01:00:00                                                0.0  
2019-11-22 01:05:00                                                0.0  
2019-11-22 01:10:00                                                0.0  
2019-11-22 01:15:00                                                0.0

When I try

    data[data<5]=0

it only zeroes the individual values below 5 and keeps everything from 5 upward, instead of treating each series as a whole. Any help would be appreciated.

Answer 1

Let's try this:

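import pandas as pd

# Copy the question's table to the clipboard first; columns are separated by 2+ spaces.
df = pd.read_clipboard(index_col=0, sep=r'\s\s+')
df.index = pd.to_datetime(df.index)

# Start a new group each time the value drops below the previous row.
grp = df['Value'].diff().lt(0).cumsum()

# Keep groups whose maximum exceeds 5; set every other row to 0.
df_out = df.where(df.groupby(grp)['Value'].transform('max').gt(5), 0)
print(df_out)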

Output:

                     Value
Index                     
2019-11-22 00:00:00    0.0
2019-11-22 00:05:00    1.0
2019-11-22 00:10:00    2.0
2019-11-22 00:15:00    3.0
2019-11-22 00:20:00    4.0
2019-11-22 00:25:00    5.0
2019-11-22 00:30:00    6.0
2019-11-22 00:35:00    7.0
2019-11-22 00:40:00    8.0
2019-11-22 00:45:00    0.0
2019-11-22 00:50:00    0.0
2019-11-22 00:55:00    0.0
2019-11-22 01:00:00    0.0
2019-11-22 01:05:00    0.0
2019-11-22 01:10:00    0.0
2019-11-22 01:15:00    0.0
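
For readers who cannot use the clipboard, the same idea can be sketched step by step with the sample data built in code. This is only an illustration under assumptions: the names `run_id` and `run_max` are not from the answer, and the frame is rebuilt by hand from the values shown in the question.

```python
import pandas as pd

# Rebuild the sample frame from the question (16 rows, 5-minute steps).
values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 0, 1, 2, 3, 4, 5]
index = pd.date_range("2019-11-22 00:00:00", periods=len(values), freq="5min")
df = pd.DataFrame({"Value": pd.Series(values, index=index, dtype="float64")})
df.index.name = "Index"

# A new run starts wherever the value drops compared with the previous row.
# diff() is NaN on the first row, and NaN < 0 is False, so the first run gets id 0.
run_id = df["Value"].diff().lt(0).cumsum()

# Broadcast each run's maximum back onto every row of that run.
run_max = df["Value"].groupby(run_id).transform("max")

# Keep rows whose run peaks above 5; every other row becomes 0.
df_out = df.copy()
df_out["Value"] = df["Value"].where(run_max.gt(5), 0)

# Show the intermediate columns side by side to make the grouping visible.
print(pd.DataFrame({"Value": df["Value"], "run_id": run_id, "run_max": run_max}))
print(df_out)
```

Printing `run_id` and `run_max` next to the values makes it easier to see why the second ramp, which peaks at exactly 5, is zeroed as a whole.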

Answer 2

Try this:

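# Assumes the default integer index (0, 1, 2, ...), so labels equal positions.
# Zero everything that is not above 5.
filtered = data["Value"].where(data["Value"] > 5, 0)
# Each 6 marks a run that went above 5; restore the five values before it.
indices_with_6 = filtered[filtered == 6].index
for idx in indices_with_6:
    filtered.iloc[idx - 5: idx] = [1., 2., 3., 4., 5.]
print(filtered)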

0     0
1     1
2     2
3     3
4     4
5     5
6     6
7     7
8     8
9     0
10    0
11    0
12    0
13    0
14    0
15    0
Name: Value, dtype: int64
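
The loop above relies on every run rising as 1, 2, 3, ... and on an integer index. A possible variant that drops both assumptions is to walk the values once and zero each run whose peak does not exceed the threshold. This is a rough sketch, not part of the original answer: the function name `zero_short_runs`, its `threshold` parameter, and the `sample` data are all made up for illustration.

```python
import pandas as pd


def zero_short_runs(values: pd.Series, threshold: float = 5) -> pd.Series:
    """Return a copy where every run that never exceeds `threshold` is set to 0.

    A run ends where the next value is lower than the current one.
    """
    result = values.copy()
    n = len(values)
    start = 0  # position where the current run begins
    for pos in range(1, n + 1):
        # The run ends at the last row or where the value drops.
        run_ends = pos == n or values.iloc[pos] < values.iloc[pos - 1]
        if run_ends:
            if values.iloc[start:pos].max() <= threshold:
                result.iloc[start:pos] = 0
            start = pos
    return result


# Hypothetical data with uneven steps and a datetime index.
sample = pd.Series(
    [1, 3, 6, 2, 4, 10, 20, 1, 2],
    index=pd.date_range("2019-11-22", periods=9, freq="5min"),
    dtype="float64",
)
print(zero_short_runs(sample))
```

Because it works on positions through `.iloc`, the sketch does not care whether the index holds integers or timestamps. For large frames a vectorised approach is likely to be faster than this Python-level loop.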
