Filter for rows in pandas dataframe where values in a column are greater than x or NaN

I'm trying to figure out how to filter a pandas dataframe so that the values in a certain column are either greater than a certain value, or are NaN. Let's say my dataframe looks like this:

df = pd.DataFrame({"col1":[1, 2, 3, 4], "col2": [4, 5, np.nan, 7]})

I've tried:

df = df[df["col2"] >= 5 | df["col2"] == np.nan]

and:

df = df[df["col2"] >= 5 | np.isnan(df["col2"])]

But the first causes an error, and the second excludes rows where the value is NaN. How can I get the result to be this:

pd.DataFrame({"col1":[2, 3, 4], "col2":[5, np.nan, 7]})

Please try:

df[df.col2.isna()|df.col2.gt(4)]



   col1  col2
1     2   5.0
2     3   NaN
3     4   7.0
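
For context on why the two attempts in the question misbehave, here is a minimal sketch using the same sample dataframe. It relies on two facts: NaN never compares equal to anything (so `== np.nan` matches no rows), and Python's `|` binds more tightly than `>=`, so each comparison needs its own parentheses. The variable names `mask` and `result` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 4], "col2": [4, 5, np.nan, 7]})

# NaN never compares equal to anything, itself included,
# so `df["col2"] == np.nan` is False on every row.
assert not (df["col2"] == np.nan).any()

# `|` binds more tightly than `>=`, so wrap each condition in parentheses
# and use .isna() to detect the missing values.
mask = (df["col2"] >= 5) | (df["col2"].isna())
result = df[mask]
print(result)
```

This keeps the `>= 5` condition from the question; with integer-valued data it selects the same rows as `gt(4)`.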

Alternatively, you can fill the NaN values with the threshold before comparing:

df[df.col2.fillna(5)>=5]
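
If the fill-then-compare idea is needed for more than one column or threshold, it could be wrapped roughly as below. This is only a sketch: `keep_ge_or_nan` is a hypothetical helper name, not a pandas function, and it assumes the comparison should apply to a single column so that the result is a row filter.

```python
import numpy as np
import pandas as pd


def keep_ge_or_nan(df: pd.DataFrame, column: str, threshold: float) -> pd.DataFrame:
    """Keep rows where `column` is >= threshold or missing (hypothetical helper)."""
    # Filling NaN with the threshold itself makes missing values pass the >= test.
    filled = df[column].fillna(threshold)
    # A boolean Series selects rows; a boolean DataFrame would mask cells instead.
    return df[filled >= threshold]


df = pd.DataFrame({"col1": [1, 2, 3, 4], "col2": [4, 5, np.nan, 7]})
filtered = keep_ge_or_nan(df, "col2", 5)
print(filtered)
```

The original NaN is preserved in the output because the fill is applied only to the temporary Series used for the comparison, not to `df` itself.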
