Check if a Python dataframe contains string in list
I have a list and a dataframe with one column named Description that looks like this:
my_list = ['dog','cat','bird'...]
df
| Description |
|three_legged_dog0named1_Charlie|
| catis_mean |
| 1hippo_stepped-on_an_ant |
I want to write a for loop that goes through each row in df and checks whether it contains an element of my_list; if it does, print that element.
Normally I'd use search(), but I don't know how it works with a list. I could write a for loop that captures all the cases, but I don't want to do that. Is there another way?
for i in df['Description']:
    if i is in my_list:
        print('the element that is in i')
    else:
        print('not in list')
The output should be:
dog
cat
not in list
If you want a non-loop pandas approach, use Series.str.findall to collect the matches, Series.str.join to join all matched values with a comma, and finally Series.replace to replace empty strings:
my_list = ['dog','cat','bird']
df['new'] = (df['Description'].str.findall('|'.join(my_list))
                              .str.join(',')
                              .replace('','not in list'))
print(df)
                       Description          new
0  three_legged_dog0named1_Charlie          dog
1                       catis_mean          cat
2         1hippo_stepped-on_an_ant  not in list
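One caveat with the approach above: '|'.join(my_list) treats every list item as a regular expression, so an item containing characters such as '.' or '+' would not be matched literally. A minimal sketch of one way to guard against that, assuming the list items are meant as plain substrings, is to pass each item through re.escape first. The sample dataframe is rebuilt here only so the snippet runs on its own:

```python
import re

import pandas as pd

my_list = ['dog', 'cat', 'bird']
df = pd.DataFrame({'Description': ['three_legged_dog0named1_Charlie',
                                   'catis_mean',
                                   '1hippo_stepped-on_an_ant']})

# Escape each item so regex metacharacters are matched literally
# (assumption: the list holds plain substrings, not patterns).
pattern = '|'.join(re.escape(item) for item in my_list)

df['new'] = (df['Description'].str.findall(pattern)   # list of matches per row
                              .str.join(',')          # 'dog,cat' if several match
                              .replace('', 'not in list'))
print(df)
```

For the three sample rows this gives the same result as the unescaped pattern; the difference only shows up once the list contains special characters.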
Another option is pd.Series.str.replace with a capture group:
pattern = f'^.*({"|".join(my_list)}).*$'
# Create a mask to rid ourselves of the pesky no matches later
mask = df.Description.str.match(pattern)
# where the magic happens, use `r'\1'` to swap in the thing that matched
df.Description.str.replace(pattern, r'\1', regex=True).where(mask, 'not in list')
0 dog
1 cat
2 not in list
Name: Description, dtype: object
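For comparison, the loop the question asks for can also be written without regular expressions, using Python's in operator for the substring test. This is only a sketch: first_match is a hypothetical helper name, not part of pandas, and "first" here means first in the order of my_list, not first by position in the string.

```python
import pandas as pd

my_list = ['dog', 'cat', 'bird']
df = pd.DataFrame({'Description': ['three_legged_dog0named1_Charlie',
                                   'catis_mean',
                                   '1hippo_stepped-on_an_ant']})


def first_match(text, candidates):
    """Return the first candidate (in list order) that occurs in text, or None."""
    return next((c for c in candidates if c in text), None)


results = []
for description in df['Description']:
    match = first_match(description, my_list)
    # Fall back to the placeholder text when nothing in the list matches.
    results.append(match if match is not None else 'not in list')
    print(results[-1])
```

The vectorized str methods are usually the better fit for large dataframes; the loop is mainly useful when the per-row logic needs to grow beyond a simple match.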