Pandas non-consecutive number filter dropping 0 row
I am trying to filter a dataset in Pandas. The number must always increase, although this can be in irregular steps. I have set up a filter to ensure any values that are smaller than their predecessor are removed from the DataFrame. This is a simple example I am working with:
import pandas as pd

test = {"Test": [1, 3, 5, 7, 9, 2, 11, 4, 13]}
df = pd.DataFrame(test)
# Keep only rows whose value is greater than the previous row's value
df = df[df.Test.shift() < df.Test]
This works, with the exception that it also drops the row at index 0, i.e. the output:
Test
1 3
2 5
3 7
4 9
6 11
8 13
is missing the first row (index 0, value 1).
Any ideas how to get this row in as well?
Try fillna with a value that would make the condition true:
df = df[df.Test.shift().fillna(df.Test - 1) < df.Test]
df:
Test
0 1
1 3
2 5
3 7
4 9
6 11
8 13
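A variant of the same idea, as a rough sketch rather than a definitive fix: shift also accepts a fill_value argument, so the gap at the first position can be filled in the same call. The name sentinel and the choice of "first value minus 1" are assumptions made for illustration, and this assumes the column is numeric and non-empty.

```python
import pandas as pd

df = pd.DataFrame({"Test": [1, 3, 5, 7, 9, 2, 11, 4, 13]})

# Assumption: the column is non-empty and numeric, so "first value - 1"
# is always smaller than the first value and makes the condition true.
sentinel = df.Test.iloc[0] - 1

# fill_value replaces the NaN that shift() would otherwise put at position 0.
previous = df.Test.shift(fill_value=sentinel)

# Keep rows that are strictly greater than their predecessor.
filtered = df[previous < df.Test]
print(filtered)
```

Because no NaN is ever introduced here, there is nothing for the comparison to trip over on the first row.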
Sample DataFrame that shows intermediate steps:
pd.DataFrame({
'shifted': df.Test.shift(),
'test': df.Test,
'condition': df.Test.shift() < df.Test,
'shifted then filled': df.Test.shift().fillna(df.Test - 1),
'fixed condition': df.Test.shift().fillna(df.Test - 1) < df.Test
})
   shifted  test  condition  shifted then filled  fixed condition
0      NaN     1      False                  0.0             True
1      1.0     3       True                  1.0             True
2      3.0     5       True                  3.0             True
3      5.0     7       True                  5.0             True
4      7.0     9       True                  7.0             True
5      9.0     2      False                  9.0            False
6      2.0    11       True                  2.0             True
7     11.0     4      False                 11.0            False
8      4.0    13       True                  4.0             True
The issue is that in the first case, NaN is not less than 1 (NaN < 1 => False).
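To see the NaN behaviour in isolation, here is a small sketch. The series s is made-up data for illustration, not the data from the question.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 3, 2])   # hypothetical data
shifted = s.shift()        # first element becomes NaN

# Ordered comparisons and equality against NaN all evaluate to False.
nan_checks = [np.nan < 1, np.nan > 1, np.nan == np.nan]

# Position 0 is False only because NaN is on the left-hand side.
mask = shifted < s

print(nan_checks)
print(mask.tolist())
```

So the first row is not dropped because it is "smaller than its predecessor"; it is dropped because it has no predecessor to compare against.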
So try with:
df[~(df.Test.diff()<0)]
Test
0 1
1 3
2 5
3 7
4 9
6 11
8 13
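A short sketch of why the diff version keeps the first row, and where it differs from the fillna version. The data below is hypothetical and contains a repeated value on purpose; the variable names are mine.

```python
import pandas as pd

df = pd.DataFrame({"Test": [1, 3, 3, 2, 5]})  # hypothetical data with a tie

step = df.Test.diff()  # NaN, 2, 0, -1, 3

# NaN < 0 is False, so negating the mask turns row 0 into True.
# A step of 0 is also "not negative", so repeated values are kept.
keep_not_decreasing = ~(step < 0)

# The fillna version uses a strict "<", so repeated values are dropped.
keep_strictly_increasing = df.Test.shift().fillna(df.Test - 1) < df.Test

print(df[keep_not_decreasing].Test.tolist())
print(df[keep_strictly_increasing].Test.tolist())
```

On the data in the question both approaches give the same result; they only disagree when a value equals its predecessor, so pick whichever matches what "must always increase" means for your data.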