How can I drop several rows from my DataFrame?
I have a dataframe (called my_df1) and want to drop several rows based on certain dates. How can I create a new dataframe (my_df2) without the dates '2020-05-01' and '2020-05-04'?
I tried the following, which did not work:
my_df2 = mydf_1[(mydf_1['Date'] != '2020-05-01') | (mydf_1['Date'] != '2020-05-04')]
my_df2.head()
The problem seems to be with your logical operator. You should be using and (&) here instead of or (|), since you have to select all the rows whose date is neither 2020-05-01 nor 2020-05-04.
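The bitwise operators & and | do not short-circuit; they are evaluated element-wise over the whole column. With |, every row differs from at least one of the two dates, so the condition is True for every row and nothing is dropped, hence the result.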
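A minimal sketch of the fix, assuming a small made-up frame with the dates stored as strings (the values below are illustrative and not the asker's data). The isin variant is an alternative that was not part of the original answer:

```python
import pandas as pd

# Hypothetical sample data; the asker's real frame is not shown.
my_df1 = pd.DataFrame({
    "Date": ["2020-04-30", "2020-05-01", "2020-05-02", "2020-05-04"],
    "Value": [10, 20, 30, 40],
})

# With "|" every row passes: each date differs from at least one of the two.
wrong = my_df1[(my_df1["Date"] != "2020-05-01") | (my_df1["Date"] != "2020-05-04")]

# With "&" a row is kept only if it differs from both dates.
my_df2 = my_df1[(my_df1["Date"] != "2020-05-01") & (my_df1["Date"] != "2020-05-04")]

# Equivalent form with isin, which is easier to extend to a longer list of dates.
my_df2_isin = my_df1[~my_df1["Date"].isin(["2020-05-01", "2020-05-04"])]

print(len(wrong), len(my_df2))
```

If the Date column holds real datetime values rather than strings, the comparison values may need converting first (for example with pd.to_datetime).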
The short explanation of your AND/OR mistake was already given by kanmaytacker. Here are a few additional recommendations:
By label: .loc
By index: .iloc
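The two access styles can be sketched as follows. The frame here is hypothetical (its values are made up for illustration), and the drop call is one possible way to remove rows by label rather than something taken from the original answer:

```python
import pandas as pd

# Hypothetical frame with a default integer index (labels 0, 1, 2).
df = pd.DataFrame({"a": [4, 1, 0], "b": [5, 3, 4], "c": [7, 2, 6]})

# By label: .loc accepts index labels or a boolean mask aligned to the index.
by_label = df.loc[(df["a"] != 4) & (df["a"] != 1), :]

# By label, removing explicit index labels instead of building a mask.
dropped = df.drop(index=[0, 1])

# By position: .iloc takes integer positions; here only the third row is kept.
by_position = df.iloc[[2], :]

print(by_label)
```

With a default integer index the labels and positions coincide, so all three results contain the same single row; with a date or string index, .loc and .iloc would take different arguments.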
By label also works without .loc, but it's slower, as it's composed of chained operations instead of a single internal operation consisting of nested loops (see here). Also, with .loc you can select on more than one axis at a time.
# Example with rows. The same logic applies to columns or additional axes.
df.loc[(df['a'] != 4) & (df['a'] != 1), :]  # ".loc" is the only addition
>>>
   a  b  c
2  0  4  6
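To illustrate selecting on two axes at once, here is a small sketch that contrasts chained indexing with a single .loc call. The frame is hypothetical and only its last row matches the output above; the other values are made up:

```python
import pandas as pd

# Hypothetical frame; only the last row matches the output shown in this answer.
df = pd.DataFrame({"a": [4, 1, 0], "b": [5, 3, 4], "c": [7, 2, 6]})
mask = (df["a"] != 4) & (df["a"] != 1)

# Chained: two separate indexing operations (rows first, then columns).
chained = df[mask][["b", "c"]]

# Single .loc call: rows and columns are selected together.
single = df.loc[mask, ["b", "c"]]

print(single)
```

Both give the same result when reading. The single .loc form is also the safer choice when assigning values, since chained indexing may operate on a temporary copy.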
Your index is a boolean set. This is true for numpy and, as a consequence, for pandas too.
(df['a'] != 4) & (df['a'] != 1)
>>>
0    False
1    False
2     True
Name: a, dtype: bool
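The same boolean indexing can be sketched directly in numpy, assuming a hypothetical array whose values are chosen to reproduce the False, False, True pattern above:

```python
import numpy as np
import pandas as pd

# Hypothetical values chosen to reproduce the mask shown above.
a = np.array([4, 1, 0])

# Element-wise comparison and "&" yield a boolean array of the same length.
mask = (a != 4) & (a != 1)

# Indexing with the boolean array keeps only the positions that are True.
kept = a[mask]

# A pandas Series built from the same data behaves the same way.
s = pd.Series(a, name="a")
s_mask = (s != 4) & (s != 1)

print(mask, kept)
```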