Python: match at least 3 words in a set
I have a str with a phrase, for example:
phrase = "My cat has two eyes and like to catch rats"
And I have a set of words, and I would like to match at least 3 of these words in the phrase.
words = set(["eyes", "like", "cat"])
Currently I have the following code:
found = bool(set(phrase.lower().split()) & words)
But it matches if any of the words are in the phrase, and I want at least 3 words to match.
What can I do to achieve this? I don't want to use regex.
You can check whether the length of the intersection is at least 3.
found = len(set(phrase.lower().split()).intersection(words)) >= 3
You can do something like the following:
from typing import Set


def words_matcher(phrase: str, words: Set[str], threshold: int = 3) -> bool:
    # Lowercase and split on whitespace, then count the words shared with `words`.
    phrase_as_set = set(phrase.lower().split())
    common_words = phrase_as_set.intersection(words)
    return len(common_words) >= threshold
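One caveat with splitting on whitespace: punctuation stays attached to the tokens, so "cat," would not match "cat". Here is a minimal sketch of a variant that strips surrounding punctuation without using regex. The name `words_matcher_punct` and the sample phrase are hypothetical, made up for illustration only.

```python
import string
from typing import Set


def words_matcher_punct(phrase: str, words: Set[str], threshold: int = 3) -> bool:
    """Hypothetical variant: strip leading/trailing punctuation from each token."""
    tokens = {token.strip(string.punctuation) for token in phrase.lower().split()}
    return len(tokens & words) >= threshold


words = {"eyes", "like", "cat"}
punctuated = "My cat, has two eyes; and like to catch rats."  # made-up sample phrase

# A plain split keeps "cat," and "eyes;" as tokens, so only "like" is shared.
plain_matches = set(punctuated.lower().split()) & words
print(len(plain_matches))
print(words_matcher_punct(punctuated, words))
```

Note that `str.strip` only removes punctuation at the ends of a token, so something like "cat's" would still not match "cat".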
You are almost there. & performs an intersection on the set objects. But instead of calling bool, you need to get the length of the result and check whether it is >= 3. Hence use this:
>>> phrase = "My cat has two eyes and like to catch rats"
>>> words = set(["eyes", "like", "cat"])
>>> len(set(phrase.lower().split()) & words) >= 3
True
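To see why bool alone is not enough, it may help to try a set where only one word occurs in the phrase. This is a small sketch; `partial_words` is a hypothetical set chosen just for this illustration.

```python
phrase = "My cat has two eyes and like to catch rats"
partial_words = {"eyes", "dog", "bird"}  # hypothetical set: only "eyes" is in the phrase

common = set(phrase.lower().split()) & partial_words

# bool() is True for any non-empty set, so a single shared word is enough.
any_match = bool(common)
# The length check requires at least three distinct shared words.
enough_matches = len(common) >= 3

print(common, any_match, enough_matches)
```

A non-empty intersection is truthy no matter how many elements it has, which is why the original code matched on a single word.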
If you want to check whether all the words in your set appear in the phrase, you can check whether it is a subset, i.e.:
phrase = "My cat has two eyes and like to catch rats"
words = set(["eyes", "like", "cat"])
print(words.issubset(phrase.lower().split())) # True
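Keep in mind that the subset check only answers "are all of them present?", which coincides with "at least 3" when the set has exactly 3 words. A short sketch of the difference, using a hypothetical 4-word set `more_words`:

```python
phrase = "My cat has two eyes and like to catch rats"
more_words = {"eyes", "like", "cat", "dog"}  # hypothetical set; "dog" is not in the phrase

tokens = set(phrase.lower().split())

# issubset needs every word of the set to be present in the phrase.
all_present = more_words.issubset(tokens)
# The length check only needs three of them.
at_least_three = len(more_words & tokens) >= 3

print(all_present, at_least_three)
```

So if the set can grow beyond three words, the length-of-intersection check is probably the one that matches the question.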