Python - check if string contains any element from a list

I need to check whether a string contains any element of a list. I'm currently using this method:

engWords = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
engSentence = "the dogs fur is black and white"

print("the english sentence is: " + engSentence)

engWords2 = []
isEnglish = 0

for w in engWords:
    if w in engSentence:
        isEnglish = 1
        engWords2.append(w)

if isEnglish == 1:
    print("The sentence is english and contains the words: ")
    print(engWords2)

The problem with this is that it gives the output:

the english sentence is: the dogs fur is black and white
The sentence is english and contains the words: 
['the', 'a', 'and', 'it']
>>> 

As you can see, 'a' and 'it' should not be present. How can I search so that it only lists individual words, rather than parts of a word as well? I'm open to any ideas using normal Python code or regex (although I'm very new to both Python and regex, so please nothing too complicated). Thank you.

It's finding those two words because they're substrings of "black" and "white" respectively. When you apply "in" to a string, it just looks for substrings of characters.
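As a small illustration of that difference (the variable names here are only for the example), the same "in" test behaves differently on a string and on a list:

```python
# On a string, `in` is a substring test: "a" occurs inside "black".
in_string = "a" in "black"

# On a list, `in` compares against whole elements, so "a" is not found.
in_list = "a" in ["black"]

# A whole element is still found in the list.
whole_word_in_list = "black" in ["black"]

print(in_string, in_list, whole_word_in_list)  # True False True
```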

Try:

engSentenceWords = engSentence.split()

And later,

if w in engSentenceWords:

That splits the original sentence into a list of individual words, and then checks against whole word values.
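Put together, the two changes might look like the sketch below. The helper name find_whole_words is made up for this example; the data is the same as in the question.

```python
def find_whole_words(sentence, candidates):
    """Return the candidates that appear as whole words in the sentence."""
    # split() with no arguments breaks the sentence on any run of whitespace.
    sentence_words = sentence.split()
    # Membership in a list compares whole elements, not substrings.
    return [w for w in candidates if w in sentence_words]


engWords = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
engSentence = "the dogs fur is black and white"

found = find_whole_words(engSentence, engWords)
if found:
    print("The sentence is english and contains the words: ")
    print(found)  # ['the', 'and']
```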

words = set(engSentence.split()).intersection(set(engWords))
if words:
    print("The sentence is english and contains the words: ")
    print(words)

Split the engSentence into tokens in a list, convert that to a set, convert engWords to a set, and find the intersection (common overlap). Then check to see if this is non-empty, and if so print out the words found.
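One thing the plain split() does not handle is capital letters or punctuation attached to a word. The sketch below is an extension of the set approach and is not part of the original answer: it lowercases each token and strips surrounding punctuation first. The example sentence and the helper name normalize are assumptions made for illustration. Sets have no defined order, so the result is sorted before printing.

```python
import string

engWords = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
# Hypothetical input with a capital letter and punctuation.
sentence = "The fur is black, and it is white."


def normalize(token):
    """Lowercase a token and strip punctuation from both ends."""
    return token.strip(string.punctuation).lower()


tokens = {normalize(t) for t in sentence.split()}
words = tokens.intersection(engWords)

if words:
    print(sorted(words))  # ['and', 'it', 'the']
```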

Or, even simpler, add a space to your sentence and your search word:

engWords = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
engSentence = "the dogs fur is black and white"

print("the english sentence is: " + engSentence)

engWords2 = []
isEnglish = 0
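# Pad both ends so every word, including the first and last, has a space on each side.
engSentence = " " + engSentence + " "

for w in engWords:
    # Require a space before and after, so "a" cannot match the end of a word like "sofa".
    if " %s " % w in engSentence: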
        isEnglish = 1
        engWords2.append(w)

if isEnglish == 1:
    print("The sentence is english and contains the words: ")
    print(engWords2)

Output is:

the english sentence is: the dogs fur is black and white
The sentence is english and contains the words: 
['the', 'and']

You may want to use regex matching. Try something like the following:

import re

match_list = ['foo', 'bar', 'eggs', 'lamp', 'owls']
match_str = 'owls are not what they seem'
match_regex = re.compile('^.*({0}).*$'.format('|'.join(match_list)))

if match_regex.match(match_str):
    print('We have a match.')

See the re documentation on python.org for details.
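Note that a plain alternation still matches inside longer words ("owls" would match in "cowls"). For the whole-word requirement in the question, a possible variation is to wrap the alternation in \b word boundaries. This is a sketch using the question's data, not the answer's original pattern; re.escape is applied in case a search word contains regex metacharacters.

```python
import re

engWords = ["the", "a", "and", "of", "be", "that", "have", "it", "for", "not"]
engSentence = "the dogs fur is black and white"

# \b matches only at the edge of a word, so "a" cannot match inside "black".
alternation = "|".join(re.escape(w) for w in engWords)
word_regex = re.compile(r"\b(?:" + alternation + r")\b")

# findall returns every match in sentence order (repeats are included).
found = word_regex.findall(engSentence)
if found:
    print("The sentence is english and contains the words: ")
    print(found)  # ['the', 'and']
```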
