Pandas Dataframe Keep Row If Column Contains Any Designated Partial String
I have a pandas data frame. Below is a sample table.
Event  Text
A      something/AWAIT TO SHIP hello
B      13579
C      AWAITING SHIP
D      24613
E      nan
I want to keep only the rows whose Text column contains the words "AWAIT TO SHIP", or contains the string 13579 or 24613. Below is my desired table:
Event  Text
A      something/AWAIT TO SHIP hello
B      13579
D      24613
Below is the code I tried:
df_STH001_2 = df_STH001[df_STH001['Text'].str.contains("AWAIT TO SHIP") == True | df_STH001['Text'].str.contains("13579") == True | df_STH001['Text'].str.contains("24613") == True]
Below is the error I get:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
You should not explicitly check == True; instead, just use the result of the call to contains.
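For context on where the ValueError comes from: in Python, | binds more tightly than ==, so the expression in the question is parsed as a chained comparison rather than as three separate conditions joined by |. The following is a minimal sketch of that behaviour. The Series s and the names a, b, bad, explicit and plain are made up for illustration and do not appear in the question.

```python
import pandas as pd

# Hypothetical three-row Series, used only to illustrate the parsing issue.
s = pd.Series(["AWAIT TO SHIP", "13579", "other"])
a = s.str.contains("AWAIT TO SHIP")
b = s.str.contains("13579")

# `a == True | b == True` is parsed as `a == (True | b) == True`.
# A chained comparison is evaluated as `(a == X) and (X == True)`,
# and `and` needs the truth value of a Series, which pandas refuses to give.
try:
    bad = a == True | b == True
    error = None
except ValueError as exc:
    error = str(exc)

# Parenthesizing each comparison avoids the error, but `== True` is
# redundant here because str.contains already returns booleans.
explicit = (a == True) | (b == True)
plain = a | b

print(error)
print(plain.equals(explicit))
```

So dropping == True both removes the redundancy and sidesteps the precedence problem.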
Here's your sample. First, we define the sample dataframe:
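import numpy as np
import pandas as pd

# Note: 13579 and 24613 are stored as integers here, not strings,
# so .str.contains returns NaN (not True) for those rows.
df1 = pd.DataFrame(data=[
    ('A', 'something/AWAIT TO SHIP hello'),
    ('B', 13579),
    ('C', 'AWAITING SHIP'),
    ('D', 24613),
    ('E', np.nan)], columns=['Event', 'Text'])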
Then I build an intermediate mask with your conditions:
In [18]: mask = (df1.Text.str.contains('AWAIT TO SHIP') |
                 df1.Text.str.contains('13579') |
                 df1.Text.str.contains('24613'))
Now you can index the original dataframe using this mask.
In [19]: df1.loc[mask]
Out[19]:
  Event                           Text
0     A  something/AWAIT TO SHIP hello
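Note that the output above contains only row A. In this sample, 13579 and 24613 are integers, and .str.contains yields NaN rather than True for non-string values. If the real column is mixed like this, one possible way to get the desired rows A, B and D is to cast the column to strings first and match all substrings with a single regular expression. This is a sketch under that assumption, not the original answer's code; the names wanted, pattern and result are made up for illustration.

```python
import re

import numpy as np
import pandas as pd

# Same sample data as above; 13579 and 24613 are integers, not strings.
df1 = pd.DataFrame(
    data=[
        ("A", "something/AWAIT TO SHIP hello"),
        ("B", 13579),
        ("C", "AWAITING SHIP"),
        ("D", 24613),
        ("E", np.nan),
    ],
    columns=["Event", "Text"],
)

# Hypothetical list of the partial strings to look for.
wanted = ["AWAIT TO SHIP", "13579", "24613"]

# Escape each substring and join with "|" so that any one of them may match.
pattern = "|".join(re.escape(w) for w in wanted)

# astype(str) turns the integer 13579 into the text "13579";
# na=False makes any value that is still missing count as a non-match.
mask = df1["Text"].astype(str).str.contains(pattern, regex=True, na=False)

result = df1.loc[mask]
print(result)
```

Row C is still excluded because "AWAITING SHIP" does not contain the exact substring "AWAIT TO SHIP". Using one joined pattern also means the list of substrings can grow without adding more | terms to the mask.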