
How to delete dataframe rows based on a condition

Here is the dataframe (csv):

Time                 Longitude      Latitude    SSHA
11/22/2013 8:57     -123.603607     81.377536   0.348
11/22/2013 8:57     -124.017502     81.387791   0.386
11/22/2013 8:57     -124.432344     81.397611   0.383
11/22/2013 8:57     -124.848099     81.406995   0.405
11/22/2013 8:57     -125.264724     81.415942   --
...                  ...            ...         ...

I want to eliminate all rows with Longitude less than 0 and greater than 40. However, when I run my script, it doesn't work.

import pandas as pd
import numpy
df =pd.read_csv(r"C:\\Users\\chz08006\\Documents\\Results1\\BlockIsland25Test.csv")

indexNames=df[(df['Longitude'] <= 0) & (df['Longitude']>=40)].index
df.drop(indexNames,inplace=True)
df

If I just enter

indexNames=df[(df['Longitude'] <= 0)].index
df.drop(indexNames,inplace=True)
df

it works fine. However, when I add & (df['Longitude']>=40), nothing changes in the dataframe! I don't even receive an error.

As I understand it, the approach you have in mind is not going to work as written, so you will need another way of doing it.

Just an example dataset:

>>> import pandas as pd
>>> import numpy as np
>>> df
   col1 col2
0   123    A
1   123    A
2   456    B
3   456    A
4   -12    C
5    41    D

If you test the same condition with the np.bitwise_and function, you can see that the boolean mask is False for every row, so this is not going to help:

>>> np.bitwise_and(df.col1 <= 0, df.col1 >= 40)
0    False
1    False
2    False
3    False
4    False
5    False
Name: col1, dtype: bool

>>> df[np.bitwise_and(df.col1 <= 0, df.col1 >= 40)]
Empty DataFrame
Columns: [col1, col2]
Index: []

As a workaround, until you find a cleaner way, you can drop the rows in two separate steps:

>>> df.drop(df.loc[df['col1'] <=0].index, inplace=True)
>>> df.drop(df.loc[df['col1'] >= 123].index, inplace=True)
>>> df
   col1 col2
5    41    D

Hope this helps.

As stated by @Joseph Crispell, the primary problem here is that x<=0 & x>=40 will always return False, because no value is both at most 0 and at least 40. So when indexNames is created, it is empty, and there are no indices to drop.

As @Erfran pointed out, it may be easier to use df[(df['Longitude'] <= 0) | (df['Longitude'] >= 40)].
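A minimal sketch of the | version follows. The small frame below is made up to stand in for the CSV (its values are illustrative, not taken from the original file), and the comparison operators are copied from the expression above.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV from the question.
df = pd.DataFrame({
    "Longitude": [-123.6, -0.5, 12.0, 39.9, 55.2],
    "SSHA": [0.348, 0.386, 0.383, 0.405, 0.410],
})

# A row is out of range if it is at or below 0 OR at or above 40.
# With & the mask would be all False, since no value satisfies both.
out_of_range = (df["Longitude"] <= 0) | (df["Longitude"] >= 40)

# Drop the matching rows by index label, as in the question.
kept = df.drop(df[out_of_range].index)
print(kept)
```

Assigning the result to a new name instead of using inplace=True keeps the original frame available for comparison; that is a stylistic choice here, not a requirement.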

But it is unclear whether you meant to say |, or whether you are looking for values between 0 and 40 (inclusive), in which case the code would look like: df[(df['Longitude'] >= 0) & (df['Longitude'] <= 40)]. Note that a chained comparison such as 0 <= df['Longitude'] <= 40 does not work on a Series; it raises a ValueError.
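If the goal is to keep only the rows inside the range, one possible sketch uses Series.between, which includes both endpoints by default. The frame below is again made-up illustrative data, and the variable names are assumptions for this example only.

```python
import pandas as pd

# Hypothetical sample data; includes the endpoints 0 and 40 on purpose.
df = pd.DataFrame({"Longitude": [-123.6, 0.0, 12.0, 40.0, 55.2]})

# between() is inclusive on both ends by default.
in_range = df["Longitude"].between(0, 40)

# Equivalent mask written with two comparisons joined by &.
same_mask = (df["Longitude"] >= 0) & (df["Longitude"] <= 40)

kept = df[in_range]       # rows with 0 <= Longitude <= 40
dropped = df[~in_range]   # everything else, via the negated mask
print(kept)
```

Negating the mask with ~ gives the out-of-range rows, which is the same set the | expression selects when its comparisons are strict (< 0 and > 40).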
