Pandas non-consecutive number filter dropping row 0

I am trying to filter a dataset in Pandas. The values must always increase, although the steps can be irregular. I have set up a filter to ensure that any value smaller than its predecessor is removed from the DataFrame. This is a simple example I am working with:

import pandas as pd

test = {"Test": [1, 3, 5, 7, 9, 2, 11, 4, 13]}
df = pd.DataFrame(test)
# Keep rows whose value exceeds the previous value by more than 1
df = df[df.Test.shift() + 1 < df.Test]

This works, except that it also drops index 0, i.e. the output:

    Test
1   3
2   5
3   7
4   9
6   11
8   13

is missing row 0 (value 1).

Any ideas how to keep this row as well?

Try fillna with a value that makes the condition true for the first row:

df = df[df.Test.shift().fillna(df.Test - 1) < df.Test]

df:

   Test
0     1
1     3
2     5
3     7
4     9
6    11
8    13

Sample DataFrame that shows the intermediate steps:

pd.DataFrame({
    'shifted': df.Test.shift(),
    'test': df.Test,
    'condition': df.Test.shift() < df.Test,
    'shifted then filled': df.Test.shift().fillna(df.Test - 1),
    'fixed condition': df.Test.shift().fillna(df.Test - 1) < df.Test
})
 shifted  test  condition  shifted then filled  fixed condition
     NaN     1      False                  0.0             True
     1.0     3       True                  1.0             True
     3.0     5       True                  3.0             True
     5.0     7       True                  5.0             True
     7.0     9       True                  7.0             True
     9.0     2      False                  9.0            False
     2.0    11       True                  2.0             True
    11.0     4      False                 11.0            False
     4.0    13       True                  4.0             True

The issue is that in the first row the shifted value is NaN, and NaN is not less than 1 (NaN < 1 => False).
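The NaN behaviour described above can be checked directly. The following is a minimal sketch, not part of the original answer; the variable names (`s`, `mask`, `fixed_mask`) are illustrative assumptions.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, 7, 9, 2, 11, 4, 13], name="Test")

# Ordering comparisons against NaN evaluate to False in both directions
nan_less = np.nan < 1
nan_greater = np.nan > 1

# shift() places NaN in the first position, so the first comparison is False
mask = s.shift() < s
first_kept = bool(mask.iloc[0])

# Filling that NaN with (value - 1) makes the first comparison True
fixed_mask = s.shift().fillna(s - 1) < s
first_kept_fixed = bool(fixed_mask.iloc[0])
```

Here `first_kept` is False and `first_kept_fixed` is True, which is why only the filled version retains row 0.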

So try:

df[~(df.Test.diff()<0)]
   Test
0     1
1     3
2     5
3     7
4     9
6    11
8    13
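A short sketch of why the diff() version keeps row 0, assuming the same sample data as the question; the names `decreased` and `result` are illustrative. diff() yields NaN for the first row, NaN < 0 is False, and negating the mask turns that into True.

```python
import pandas as pd

df = pd.DataFrame({"Test": [1, 3, 5, 7, 9, 2, 11, 4, 13]})

# diff() is value minus previous value; the first row has no predecessor, so it is NaN
decreased = df.Test.diff() < 0   # NaN < 0 is False for the first row

# Negating the mask keeps every row that did NOT decrease, including row 0
result = df[~decreased]

# Difference from the strict "<" filter: a repeated value has diff == 0 and is kept
ties = pd.Series([1, 1, 2])
ties_kept = ties[~(ties.diff() < 0)].tolist()
```

Note that this variant treats equal consecutive values as acceptable, whereas the shift-based comparison with a strict `<` would drop them. Which behaviour is wanted depends on whether repeats count as "increasing" in the data.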
