How can I check if two strings have n or more words in common (Python)?
I have two strings that look (for example) like this:
string1 = "Password age (minimum) is set to 60"
string2 = "age of password is set to 60 or less!"
(In my task there are many more strings, of course, but these two are just to show you my problem.)
So now I want to check whether those 2 strings have >= 3 words in common.
So the output would be something like this:
string1 & string2 have 3 or more words in common:
["Password", "is", "set", "to", "age", "60"]
A solution could look like this:
set(string1.split()).intersection(string2.split())
which gives you a set of words that occur in both strings. Keep in mind that this approach is fairly naive: it splits only on whitespace, so punctuation stays attached to the words (for example "less!" or "(minimum)"), contractions like don't are not handled specially, and the comparison is case-sensitive ("Password" and "password" count as different words). You might want to look into a more sophisticated tokenizer to achieve more accurate results (e.g. NLTK).
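To illustrate the tokenization point, here is a minimal sketch that uses the standard library's re module instead of NLTK. The helper names tokenize and common_words and the regular expression are my own assumptions, not part of the answer above; the pattern lowercases the text and keeps runs of ASCII letters and digits, allowing an inner apostrophe.

```python
import re

def tokenize(text):
    # Hypothetical helper: lowercase the text, then extract runs of
    # ASCII letters/digits, keeping inner apostrophes (as in "don't").
    return re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())

def common_words(a, b):
    # Intersection of the two token sets
    return set(tokenize(a)) & set(tokenize(b))

string1 = "Password age (minimum) is set to 60"
string2 = "age of password is set to 60 or less!"
print(sorted(common_words(string1, string2)))
# ['60', 'age', 'is', 'password', 'set', 'to']
```

This is only a rough stand-in for a real tokenizer: it ignores non-ASCII letters, so for anything beyond simple English text a library tokenizer is probably the better choice.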
string1 = "Password age (minimum) is set to 60".split()
string2 = "age of password is set to 60 or less!".split()
# collect every word of string1 that also appears in string2 (case-sensitive)
common = []
for word in string1:
    if word in string2:
        common.append(word)
print(common)
If you want to see the remaining items (the words that are not shared), use not in instead of in.
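As a small sketch of the not in variant, a list comprehension can collect the words of the first string that do not appear in the second. The variable names are mine, and the comparison is case-sensitive, so "Password" is reported as unmatched even though "password" appears in the second string.

```python
words1 = "Password age (minimum) is set to 60".split()
words2 = "age of password is set to 60 or less!".split()

# Keep the words from the first string that are absent from the second
only_in_first = [w for w in words1 if w not in words2]
print(only_in_first)
# ['Password', '(minimum)']
```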
My solution for this would be to use sets and the & intersection operator. However, there are many ways this could be done.
def countShared(s1, s2):
    # set the strings to lowercase
    s1 = s1.lower()
    s2 = s2.lower()
    # convert the strings to lists, split by the space character
    s1List = s1.split(" ")
    s2List = s2.split(" ")
    # we can get the shared words by converting the lists to sets and
    # using the & operator, which gets the 'intersection' (aka the shared items) of both sets
    sharedWords = set(s1List) & set(s2List)
    # to make it easier to use, we turn it back into a list
    return list(sharedWords)
For your example, you would then run it like this:
string1 = "Password age (minimum) is set to 60"
string2 = "age of password is set to 60 or less!"
outWords = countShared(string1, string2)
print("string1 & string2 have " + str(len(outWords)) + " words in common:")
print(", ".join(outWords))
This gives the output:
string1 & string2 have 6 words in common:
age, set, 60, to, password, is
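Since the question asks about a threshold (3 or more words) and mentions having many strings, the same idea could be extended roughly as follows. This is only a sketch: shared_words and pairs_with_common_words are hypothetical names, the third sample string is made up for illustration, and words are still split on whitespace only.

```python
from itertools import combinations

def shared_words(s1, s2):
    # Case-insensitive set intersection of whitespace-separated words
    return set(s1.lower().split()) & set(s2.lower().split())

def pairs_with_common_words(strings, n=3):
    # Return (a, b, shared) for every pair sharing at least n words
    result = []
    for a, b in combinations(strings, 2):
        shared = shared_words(a, b)
        if len(shared) >= n:
            result.append((a, b, sorted(shared)))
    return result

strings = [
    "Password age (minimum) is set to 60",
    "age of password is set to 60 or less!",
    "completely unrelated sentence",  # made-up example string
]
for a, b, shared in pairs_with_common_words(strings, n=3):
    print(f"{a!r} & {b!r} have {len(shared)} words in common: {shared}")
```

Note that comparing every pair grows quadratically with the number of strings, which is fine for small inputs but may need a different approach for very large collections.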