Check if any word in a list is present in a string (Python)
I have nearly 50k documents in a MongoDB collection that look roughly like this:
{
    "title": "sample title sample title",
    "content": "test content test content",
    "reply": {
        "replyContent": "sample reply content test"
    }
}
and I have an array of words like this:
wordArr = ["sample","test"]
I need to check whether any word from wordArr is present in my collection of documents. I have to iterate over each document in the collection and check whether any of the words in the array is present in any of the fields, i.e. title, content and replyContent.
The following should work, assuming each document from your mongo collection is available as a dictionary (sorry, I have no experience with mongo collections).
doc = {"title": "sample title sample title",
       "content": "test content test content",
       "reply": {"replyContent": "sample reply content test"}
       }
wordArr = ["sample", "test"]

for word in wordArr:
    for key, value in doc.items():
        if key == 'reply':
            # "reply" holds a nested dict, so check its string values instead
            for key2, value2 in value.items():
                if word in value2:
                    print('Word `%s` present in `%s`: %s' % (word, key2, value2))
        elif word in value:
            # `in` on strings is a substring test
            print('Word `%s` present in `%s`: %s' % (word, key, value))
This will give you the following output (Python 3.7+, where dictionaries keep insertion order):
> python test.py
Word `sample` present in `title`: sample title sample title
Word `sample` present in `replyContent`: sample reply content test
Word `test` present in `content`: test content test content
Word `test` present in `replyContent`: sample reply content test
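One thing to watch: `in` on strings is a substring test, so "test" would also match inside "contest". If you only want whole words, a regex with word boundaries is one option. This is a minimal sketch, not the only way to do it; `find_words` is a made-up helper name, and the field names are taken from the sample document in the question.

```python
import re

def find_words(doc, words):
    """Return (word, field) pairs where `word` occurs as a whole word."""
    # \b anchors the match at word boundaries; re.escape guards against
    # words that contain regex metacharacters.
    patterns = {w: re.compile(r"\b" + re.escape(w) + r"\b") for w in words}
    fields = {
        "title": doc.get("title", ""),
        "content": doc.get("content", ""),
        "replyContent": doc.get("reply", {}).get("replyContent", ""),
    }
    return [(w, name) for w, pat in patterns.items()
            for name, text in fields.items() if pat.search(text)]

# "contest" contains "test" as a substring, but not as a whole word
doc = {"title": "sample title sample title",
       "content": "a contest entry",
       "reply": {"replyContent": "sample reply content test"}}

matches = find_words(doc, ["sample", "test"])
print(matches)
```

Here `content` is not reported for "test", even though `"test" in doc["content"]` is True.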
If you just want to return True or False:
d = {"title": "sample title sample title",
     "content": "test content test content",
     "reply": {
         "replyContent": "sample reply content test"
     }
     }
word_set = {"sample", "test"}

def is_present(d, st):
    for v in d.values():
        if isinstance(v, dict):
            # One level of nesting: check the string values of the inner dict
            for val in v.values():
                if any(word in st for word in val.split()):
                    return True
        else:
            # Plain string field: split on whitespace and test each token
            if any(word in st for word in v.split()):
                return True
    return False

print(is_present(d, word_set))
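Since the documents live in MongoDB, another option may be to let the server do the filtering rather than pulling all 50k documents into Python. The sketch below only builds the filter document, using MongoDB's `$or` and `$regex` query operators. `build_filter` is a made-up helper name, the field list is assumed from the sample document, and this has not been run against a real database, so treat it as a starting point.

```python
import re

def build_filter(words, fields):
    """Build a MongoDB filter: any of `fields` contains any of `words`."""
    # One alternation pattern for all words, e.g. \b(?:sample|test)\b
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return {"$or": [{field: {"$regex": pattern}} for field in fields]}

word_arr = ["sample", "test"]
# Dot notation reaches the nested replyContent field
fields = ["title", "content", "reply.replyContent"]
query = build_filter(word_arr, fields)
print(query)

# With pymongo this would be used roughly as follows
# (`collection` is a hypothetical pymongo Collection object):
# for doc in collection.find(query):
#     ...
```

An unanchored regex like this usually cannot make good use of an index, so for large collections a text index may be worth looking into instead.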
If you have arbitrary levels of nesting, you might need a recursive approach.
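For arbitrary nesting, something along these lines could work. It is a sketch under a few assumptions: values are strings, dicts, lists or tuples, words are matched as whitespace-separated tokens, and `contains_any` is just an illustrative name.

```python
def contains_any(value, words):
    """Return True if any string nested anywhere in `value` contains a word from `words`."""
    if isinstance(value, str):
        return any(token in words for token in value.split())
    if isinstance(value, dict):
        return any(contains_any(v, words) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(contains_any(v, words) for v in value)
    return False  # numbers, None, and other non-text values never match

nested = {"title": "plain title",
          "reply": {"thread": [{"replyContent": "deeply nested test"}]}}

print(contains_any(nested, {"sample", "test"}))   # True
print(contains_any(nested, {"missing"}))          # False
```

Because `any` short-circuits, the search stops at the first matching string.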