How to drop certain rows from a dataframe if they partially meet a certain condition
I'm trying to drop rows from a dataframe if they 'partially' meet a certain condition.
By 'partially' I mean that part (not all) of the value in a cell meets the condition.
Let's say that I have this dataframe.
>>> df
   Title                           Body
0  Monday report: Stock market     You should consider buying this.
1  Tuesday report: Equity          XX happened.
2  Corrections and clarifications  I'm sorry.
3  Today's top news                Yes, it skyrocketed as I predicted.
I want to remove the entire row if the Title contains "Monday report:" or "Tuesday report:".
One thing to note is that I used
TITLE = []
.... several lines of code to crawl the titles.
TITLE.append(headline)
to crawl the titles and store them in the dataframe.
Another thing is that my data are in tuples because I used
df = pd.DataFrame(list(zip(TITLE, BODY)), columns =['Title', 'Body'])
to make the dataframe.
I think that's why, when I used
df.query("'Title'.str.contains('Monday report:')")
I got an error.
When I searched here on StackOverflow, some answers advised converting the tuples into a multi-index and using filter(), drop(), or isin().
None of them worked.
Or maybe I used them in the wrong way?
Any ideas on how to solve this problem?
You can do a basic filter for a condition and then pick the reverse of it using ~:
e.g. df[~df['Title'].str.contains('Monday report')]
This will give you output that excludes all rows that contain 'Monday report' in the title.
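To cover both prefixes from the question in one pass, the same inverted-mask idea can be combined with a regular expression. The following is a minimal sketch, assuming the sample data shown in the question; the names pattern, mask and filtered are illustrative and not from the original post.

```python
import pandas as pd

# Sample data copied from the question; in the real script these lists
# would be filled by the crawler.
TITLE = [
    "Monday report: Stock market",
    "Tuesday report: Equity",
    "Corrections and clarifications",
    "Today's top news",
]
BODY = [
    "You should consider buying this.",
    "XX happened.",
    "I'm sorry.",
    "Yes, it skyrocketed as I predicted.",
]
df = pd.DataFrame(list(zip(TITLE, BODY)), columns=['Title', 'Body'])

# Match titles that start with either prefix. The (?:...) group is
# non-capturing, so pandas does not warn about match groups.
pattern = r'^(?:Monday|Tuesday) report:'

# True for rows whose Title matches; na=False treats missing titles as
# non-matching so the mask is strictly boolean.
mask = df['Title'].str.contains(pattern, regex=True, na=False)

# ~ inverts the mask, keeping only the rows that do NOT match.
filtered = df[~mask]
print(filtered)
```

The ^ anchor restricts the match to titles that begin with one of the prefixes; dropping it would match the prefix anywhere in the title. Note that df[~mask] returns a new dataframe and keeps the original index labels, so assign it back to df (and call reset_index(drop=True) if a fresh index is wanted).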