
How to match between two lists of approximate words

I have the following parent list:

parent_list = ['AWS', 'GCP', 'ALIBABA', 'AZURE']

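The incoming input is sentence = 'The use is asking for AWS and GCP'

I need to check the incoming input against parent_list and collect the matches in a list.

The expected result is ['AWS', 'GCP']

My code is below, and it works fine: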

[i for i in parent_list if i in sentence ]

Now I need to do some approximate matching. Say sentence = 'The use is asking for AliBab and gcp'

You can see that AliBab approximates ALIBABA.

The expected result is ['ALIBABA', 'GCP']

An attempt might look like this:

types = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
sentence = 'The use is asking for AW and GCP or something'

result = []
for word in sentence.split():
    for t in types:
        if word.lower() in t.lower() or t.lower() in word.lower():
            result.append(t)

print(result)

Or as a list comprehension:

result = [t for word in sentence.split()
           for t in types
           if word.lower() in t.lower() or t.lower() in word.lower()]

It looks cleaner, but it is a little harder to read.
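One thing to keep in mind: because the outer loop runs over the words of the sentence, a type mentioned twice shows up twice in the result, and the order follows the sentence rather than the list. A minimal sketch of one way to drop the duplicates, assuming first-appearance order is acceptable (match_types is a hypothetical helper name, not part of the original answer):

```python
types = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
sentence = 'The use is asking for GCP and AW and gcp'

def match_types(sentence, types):
    """Return matched types without duplicates, in order of first appearance."""
    hits = [t for word in sentence.split()
            for t in types
            if word.lower() in t.lower() or t.lower() in word.lower()]
    # dict.fromkeys keeps the first occurrence of each key and preserves order
    return list(dict.fromkeys(hits))

result = match_types(sentence, types)
print(result)  # ['GCP', 'AWS']
```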

For more than one delimiter, use:

import re
for word in re.split(' |,', sentence):
    ...  # same matching logic as in the loop above

Like this:

result = [t for word in re.split(' |,', sentence)
           for t in types
           if word.lower() in t.lower() or t.lower() in word.lower()]

When adding delimiters, note that ',' is different from ', '.
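The difference matters more than it looks. With the pattern ' |,', a comma followed by a space counts as two delimiters, so re.split() returns an empty string between them, and an empty string is a substring of every type name. A small sketch of the problem and one possible fix (the sample sentence is made up for illustration):

```python
import re

types = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
sentence = 'The use is asking for AW, GCP'

# ', ' is seen as two delimiters, so an empty token appears between them
tokens = re.split(' |,', sentence)

# '' in 'aws' is True, so empty tokens must be dropped before matching
words = [w for w in tokens if w]

result = [t for word in words
          for t in types
          if word.lower() in t.lower() or t.lower() in word.lower()]
print(result)  # ['AWS', 'GCP']
```

Without the filter, the empty token would match all four types. Splitting on '[, ]+' instead treats a run of delimiters as one, but filtering empty tokens is still a cheap safeguard.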

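It depends on how you define an approximate match.

If being a substring is the criterion, you can iterate over the words of the sentence and the parent list, and return a match whenever a word of the sentence appears as a substring of a parent list element.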

matches = [elt for elt in parent_list if any(word.lower() in elt.lower() for word in sentence.split())]

You can use re.split() to split on multiple delimiters:

parent_list = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
sentence = "The use is asking for AliBab and gcp"
import re
matches = [elt for elt in parent_list if any(word.lower() in elt.lower() or elt.lower() in word.lower() for word in re.split('[, ]', sentence))]
print(matches)

sentence = "The use is asking for AWS,GCP"
matches = [elt for elt in parent_list if any(word.lower() in elt.lower() or elt.lower() in word.lower() for word in re.split('[, ]', sentence))]
print(matches)

You can do it like this:

parent_list = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
used_words = []
string = "The use is asking for AWS and GCP"
for word in parent_list:
    if word.lower() in string.lower():
        used_words.append(word)

print(used_words)
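Note that checking against the whole string also matches inside longer words, for example AWS inside "laws". If only whole words should count, a word-boundary regex is one possible variation. This sketch handles case-insensitive exact words only, not approximate ones, and the sample string is made up for illustration:

```python
import re

parent_list = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
string = "New laws are asking for GCP"

# Plain substring test: 'aws' is found inside 'laws'
plain = [word for word in parent_list if word.lower() in string.lower()]

# \b anchors the match to word boundaries; re.escape guards special characters
whole = [word for word in parent_list
         if re.search(r'\b' + re.escape(word) + r'\b', string, flags=re.IGNORECASE)]

print(plain)  # ['AWS', 'GCP']
print(whole)  # ['GCP']
```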

