How to match two lists of approximate words

Question

I have the parent list below:

parent_list = ['AWS', 'GCP', 'ALIBABA', 'AZURE']

The incoming input is sentence = 'The use is asking for AWS and GCP'.

I need to check the incoming input against parent_list and collect the matches in a list.

The expected output is ['AWS', 'GCP'].

My code is below, and it works fine:

[i for i in parent_list if i in sentence ]

Now I need to do some approximate matching. Say sentence = 'The use is asking for AliBab and gcp'.

You can see that AliBab is an approximation of ALIBABA.

The expected output is ['ALIBABA', 'GCP'].

Answer 1

You might try this:

types = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
sentence = 'The use is asking for AW and GCP or something'

result = []
for word in sentence.split():
    for t in types:
        if word.lower() in t.lower() or t.lower() in word.lower():
            result.append(t)

print(result)

Or with a list comprehension:

result = [t for word in sentence.split()
           for t in types
           if word.lower() in t.lower() or t.lower() in word.lower()]

It looks cleaner, but it is a bit more complicated.

For more than one delimiter, use:
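One thing to watch with the loop and the comprehension above: a type is appended once per matching word, so the result can contain duplicates, and very short words such as 'a' or 'I' are substrings of several types. A minimal sketch of one way to tighten this, assuming a hypothetical helper match_types and an arbitrary minimum word length of 2 (both are illustrative choices, not part of the original answer):

```python
types = ['AWS', 'GCP', 'ALIBABA', 'AZURE']

def match_types(sentence, types, min_len=2):
    """Substring match in both directions, skipping very short words."""
    hits = [t for word in sentence.split()
            if len(word) >= min_len          # assumption: ignore 1-letter words
            for t in types
            if word.lower() in t.lower() or t.lower() in word.lower()]
    # dict.fromkeys removes duplicates while keeping first-seen order
    return list(dict.fromkeys(hits))

print(match_types('I need a GCP or gcp account', types))
```

The length threshold is only a heuristic; a short fragment such as 'ba' would still match ALIBABA, so it may need tuning for real input.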

import re
for word in re.split(' |,', sentence):

For example:

result = [t for word in re.split(' |,', sentence)
           for t in types
           if word.lower() in t.lower() or t.lower() in word.lower()]

When adding delimiters, note that ',' is a different delimiter from ', '.
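The difference matters more than it might seem: with the pattern ' |,' a comma followed by a space produces an empty string, and an empty string is a substring of every type, so everything would match. A small sketch of the effect and one possible workaround, assuming the pattern [,\s]+ (a run of commas or whitespace treated as a single delimiter) is acceptable for the input:

```python
import re

types = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
sentence = 'The use is asking for AWS, GCP'

# ' |,' splits on every single space or comma, so ', ' leaves '' behind
print(re.split(' |,', sentence))

# Treat a run of commas/whitespace as one delimiter and drop any empty strings
words = [w for w in re.split(r'[,\s]+', sentence) if w]
result = [t for t in types
          if any(w.lower() in t.lower() or t.lower() in w.lower() for w in words)]
print(result)
```

Iterating over types in the outer position also means each type appears at most once in the result.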

Answer 2

It depends on how you define an approximate match.

If being a substring is the criterion, you can iterate over the words of the sentence and over the parent list, and return a match whenever a word of the sentence appears as a substring of an element of the parent list.

matches = [elt for elt in parent_list if any(word.lower() in elt.lower() for word in sentence.split())]

You can use re.split() to split on multiple delimiters:

import re

parent_list = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
sentence = "The use is asking for AliBab and gcp"
matches = [elt for elt in parent_list if any(word.lower() in elt.lower() or elt.lower() in word.lower() for word in re.split('[, ]', sentence))]
print(matches)

sentence = "The use is asking for AWS,GCP"
matches = [elt for elt in parent_list if any(word.lower() in elt.lower() or elt.lower() in word.lower() for word in re.split('[, ]', sentence))]
print(matches)
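If the substring criterion is not the right definition (for example, a typo in the middle of a word would not be a substring of anything), another option is a similarity score. A minimal sketch using difflib.get_close_matches from the standard library; the helper name fuzzy_matches and the cutoff of 0.8 are assumptions chosen for illustration and would need tuning for real data:

```python
import difflib
import re

def fuzzy_matches(sentence, parent_list, cutoff=0.8):
    """Return parent_list entries that closely resemble some word in sentence."""
    # Map lowercase names back to their original spelling
    lowered = {p.lower(): p for p in parent_list}
    words = [w.lower() for w in re.split(r'[,\s]+', sentence) if w]
    found = []
    for word in words:
        # n=1 keeps only the best candidate scoring at or above the cutoff
        for hit in difflib.get_close_matches(word, list(lowered), n=1, cutoff=cutoff):
            if lowered[hit] not in found:
                found.append(lowered[hit])
    return found

parent_list = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
print(fuzzy_matches("The use is asking for AliBab and gcp", parent_list))
# ['ALIBABA', 'GCP']
```

Results come back in the order the words appear in the sentence. A lower cutoff accepts looser matches at the cost of more false positives.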

Answer 3

You can do this:

parent_list = ['AWS', 'GCP', 'ALIBABA', 'AZURE']
used_words = []
string = "The use is asking for AWS and GCP"
for word in parent_list:
    if word.lower() in string.lower():
        used_words.append(word)

print(used_words)

3 answers

Answer 1
1 2021-03-22 09:39:46

Answer 2
1 Accepted 2021-03-22 09:42:51

Answer 3
1 2021-03-22 09:42:52
