
Python match at least 3 words in a set

I have a str containing a phrase, for example:

phrase = "My cat has two eyes and like to catch rats"

And I have a set of words, and I would like to match at least 3 of these words in the phrase.

words = set(["eyes", "like", "cat"])

Currently I have the following code:

found = bool(set(phrase.lower().split()) & words)

But it matches if any of the words are in the phrase, and I want at least 3 words to match.

What can I do to achieve this? I don't want to use regex.

You can check if the length of the intersection is at least 3.

found = len(set(phrase.lower().split()).intersection(words)) >= 3
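To see why this works, it can help to look at the intersection itself before taking its length. The following is only a minimal sketch using the phrase and words from the question; the intermediate names tokens and matched are introduced here for illustration.

```python
phrase = "My cat has two eyes and like to catch rats"
words = {"eyes", "like", "cat"}

# Lowercase the phrase and split on whitespace to get the distinct tokens
tokens = set(phrase.lower().split())

# The intersection holds only the target words that occur in the phrase
matched = tokens & words

# bool(matched) is True as soon as one word matches; comparing the size enforces the minimum
found = len(matched) >= 3

print(sorted(matched), found)  # ['cat', 'eyes', 'like'] True
```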

You can do something like the following:

from typing import Set


def words_matcher(phrase: str, words: Set[str], threshold: int = 3) -> bool:
    phrase_as_set = set(phrase.lower().split())
    common_words = phrase_as_set.intersection(words)
    return len(common_words) >= threshold
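One thing to keep in mind with this approach is that split() only separates on whitespace, so a token such as "rats." keeps its punctuation, and the words in the set are compared as given, without lowercasing. If that matters for your input, a possible variant is sketched below. The function name words_matcher_normalized is hypothetical, and stripping punctuation with string.punctuation is an assumption about what counts as a word, not part of the answer above.

```python
import string
from typing import Set


def words_matcher_normalized(phrase: str, words: Set[str], threshold: int = 3) -> bool:
    # Strip leading/trailing ASCII punctuation from each token,
    # since plain split() would keep "rats." and "eyes," as they are
    tokens = {token.strip(string.punctuation) for token in phrase.lower().split()}
    # Lowercase the target words too, so "Cat" and "cat" compare equal
    targets = {word.lower() for word in words}
    return len(tokens & targets) >= threshold


print(words_matcher_normalized("My cat has two eyes, and likes rats.", {"Eyes", "Rats", "cat"}))  # True
print(words_matcher_normalized("My cat has two eyes.", {"eyes", "like", "cat"}))  # False
```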

You are almost there. & performs an intersection on the set objects. But instead of calling bool, you need to get the length and check whether it is >= 3. Hence use this:

>>> phrase = "My cat has two eyes and like to catch rats"
>>> words = set(["eyes", "like", "cat"])

>>> len(set(phrase.lower().split()) & words) >= 3
True

If you want to check whether all the words in your set appear in the phrase, you can check whether it is a subset, i.e.:

phrase = "My cat has two eyes and like to catch rats"
words = set(["eyes", "like", "cat"])
print(words.issubset(phrase.lower().split()))  # True
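Note that the subset check and the "at least 3" check only agree when the set has exactly 3 words. As a small illustration, assume the set is extended with a fourth word, "dog", that does not occur in the phrase (this extra word is made up for the example):

```python
phrase = "My cat has two eyes and like to catch rats"
words = {"eyes", "like", "cat", "dog"}  # "dog" is not in the phrase

tokens = set(phrase.lower().split())

# Subset check: every word in the set must appear in the phrase
all_present = words.issubset(tokens)

# Threshold check: any 3 of the words are enough
at_least_three = len(words & tokens) >= 3

print(all_present, at_least_three)  # False True
```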


 