Check if a string contains any item of a set of strings?

I have a text file that has one sentence per line, and I have a word list. I want to get only the sentences that contain at least one word from the list. Is there a pythonic way to do that?

sentences = [line for line in f if any(word in line for word in word_list)]

Here f is your file object; for example, if file.txt is the file name and it is in the same directory as the script, you can replace f with open('file.txt').
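A self-contained sketch of the one-liner above, with io.StringIO standing in for a real file so it can be run as is. The sample sentences and the contents of word_list are made up for illustration. Note that `word in line` is a substring test, so 'cat' also matches inside 'concatenate'.

```python
import io

# Hypothetical sample data standing in for the file and the word list.
f = io.StringIO(
    "The cat sat on the mat.\n"
    "Dogs bark loudly.\n"
    "Please concatenate the files.\n"
)
word_list = ["cat", "mat"]

# Keep each line that contains at least one of the words as a substring.
sentences = [line for line in f if any(word in line for word in word_list)]

for s in sentences:
    print(s.rstrip("\n"))
```

If substring matches like the third line are unwanted, a whole-word comparison is the safer choice.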

Using set.intersection:

# word_set is assumed to be a set of lowercase words, e.g. set(word_list)
with open('file') as f:
    sentences = [line for line in f if set(line.lower().split()).intersection(word_set)]

Or with filter:

sentences = list(filter(lambda x: word_set.intersection(x.lower().split()), f))
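Both set-based versions compare whole whitespace-separated tokens, so a word followed by punctuation (for example 'mat.') will not match 'mat'. One possible sketch that tokenizes with a regular expression instead; the helper name `matching_lines` and the sample data are assumptions for illustration, not part of the answers above.

```python
import re


def matching_lines(lines, words):
    """Return the lines that share at least one whole word with `words`."""
    word_set = {w.lower() for w in words}
    result = []
    for line in lines:
        # \w+ extracts runs of letters, digits and underscores,
        # which drops surrounding punctuation.
        tokens = set(re.findall(r"\w+", line.lower()))
        if tokens & word_set:
            result.append(line)
    return result


# Hypothetical sample data.
lines = [
    "The cat sat on the mat.",
    "Dogs bark loudly.",
    "Please concatenate the files.",
]
print(matching_lines(lines, ["Mat", "dogs"]))
```

Because the comparison is on whole tokens, 'cat' here would match only the first line and not 'concatenate'.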

This will give you a start:

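words = ['a', 'and', 'foo']
match_sentences = []

with open('myfile.txt', 'r') as infile:
    for line in infile:
        # check for words in this line (whole-word match on whitespace-split tokens)
        if any(word in line.split() for word in words):
            # if match, append to match_sentences list
            match_sentences.append(line)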
