
Pandas non-consecutive number filter dropping 0 row

I am trying to filter a dataset in Pandas. The number must always increase, although this can be in irregular steps. I have set up a filter to ensure any values that are smaller than their predecessor are removed from the DataFrame. This is a simple example I am working with:

import pandas as pd

test = {"Test": [1, 3, 5, 7, 9, 2, 11, 4, 13]}
df = pd.DataFrame(test)
df = df[df.Test.shift() + 1 < df.Test]

This works, except that it also drops the row at index 0, i.e. the output:

    Test
1   3
2   5
3   7
4   9
6   11
8   13

is missing row 0 (value 1).

Any ideas how to get this row in as well?

Try fillna with a value that would make the condition true:

df = df[df.Test.shift().fillna(df.Test - 1) < df.Test]

df:

   Test
0     1
1     3
2     5
3     7
4     9
6    11
8    13

Sample DataFrame that shows intermediate steps:

pd.DataFrame({
    'shifted': df.Test.shift(),
    'test': df.Test,
    'condition': df.Test.shift() < df.Test,
    'shifted then filled': df.Test.shift().fillna(df.Test - 1),
    'fixed condition': df.Test.shift().fillna(df.Test - 1) < df.Test
})

 shifted  test  condition  shifted then filled  fixed condition
     NaN     1      False                  0.0             True
     1.0     3       True                  1.0             True
     3.0     5       True                  3.0             True
     5.0     7       True                  5.0             True
     7.0     9       True                  7.0             True
     9.0     2      False                  9.0            False
     2.0    11       True                  2.0             True
    11.0     4      False                 11.0            False
     4.0    13       True                  4.0             True

The issue is that in the first row shift() produces NaN, and NaN is not less than 1 (NaN < 1 => False), so that row fails the condition and is dropped.
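
The NaN behaviour can be checked directly. The following is a minimal sketch, assuming only that pandas and NumPy are installed; the variable names are illustrative. It shows that ordering comparisons against NaN are False, and that an explicit isna() check is another way to keep the first row.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 5, 7, 9, 2, 11, 4, 13], name="Test")
prev = s.shift()  # first element is NaN: row 0 has no predecessor

# Any ordering comparison with NaN evaluates to False.
nan_less = bool(np.nan < 1)
nan_greater = bool(np.nan > 1)

# Keep a row if it has no predecessor OR it exceeds its predecessor.
keep = prev.isna() | (prev < s)
result = s[keep]
kept_index = result.index.tolist()
print(kept_index)
```

Compared with fillna(df.Test - 1), the isna() form states the intent ("the first row has nothing to compare against") explicitly, at the cost of one extra boolean operation.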

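Alternatively, use diff() and drop only the rows where the difference is negative. The first row's difference is NaN, and NaN < 0 is False, so negating the mask keeps it:

df[~(df.Test.diff() < 0)]

   Test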
0     1
1     3
2     5
3     7
4     9
6    11
8    13
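
The two answers give the same result on the example data, but they may differ when a value repeats. The sketch below uses a hypothetical input with a duplicate (not taken from the question) to show the difference: the shift/fillna form uses a strict comparison, while the diff form only removes negative steps.

```python
import pandas as pd

# Hypothetical input with a repeated value (3 appears twice).
s = pd.Series([1, 3, 3, 5], name="Test")

# shift/fillna form: strict comparison, so the repeated 3 is dropped.
strict = s[s.shift().fillna(s - 1) < s]

# diff form: only negative steps are removed, so the repeated 3 is kept.
non_decreasing = s[~(s.diff() < 0)]

strict_values = strict.tolist()
non_decreasing_values = non_decreasing.tolist()
print(strict_values)
print(non_decreasing_values)
```

If repeated values should also be removed, changing the diff condition to s.diff() <= 0 would make that form strict as well.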

