Extracting whitespace-separated words from a sentence in Python

Question

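I have a list of strings, say x1 = ['esk','wild man','eskimo', 'sta','(+)-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa'], and I need to extract the entries of x1 that occur in a few sentences.

My sentence is "eskimo lives as a wild man in wild jungle and he stands as a guard". From this sentence I need to extract the first word, eskimo, and the fifth and sixth words, wild man, because they appear as separate words exactly as listed in x1. Even though sta occurs inside stands, I should not extract "sta".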

def get_name(input_str):
    prod_name = []
    for row in x1:
        if (row.strip().lower() in input_str.lower().strip()) or (len([x for x in input_str.split() if "\b"+x in row])>0):
            prod_name.append(row)
    return list(set(prod_name))

Calling get_name("eskimo lives as a wild man in wild jungle and he stands as a guard") returns

[esk, eskimo,wild man,sta]

but the expected result is

[eskimo,wild man]

What do I need to change in the code?

Answer 1

You can simply use str.split(" ") to get a list of all the words in the sentence, and then do the following:

s = "eskimo lives as a wild man in wild jungle and he stands as a guard"

l = s.split(" ")

x1 = ['esk','wild man','eskimo', 'sta','(+)-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']
new_x1 = [word.split(" ") for word in x1 if " " in word] + [word for word in x1 if " " not in word]

ans = []

for x in new_x1:
    if type(x) == str:
        if x in l:
            ans.append(x)
    else:
        temp = ""
        for i in x:
            temp += i + " "
        temp = temp[:-1]
        if all(sub_x in l for sub_x in x) and temp in s:
            ans.append(temp)

print(ans)

Answer 2

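I have a slightly different approach. First, split the input sentence into words, and split each phrase you want to check into its component words. Then check whether all the words of a phrase are present in the sentence.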

x1 = ['esk','wild man','eskimo', 'sta','(+)-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']
input_sentence = "eskimo lives as a wild man in wild jungle and he stands as a guard"
# Remove all punctuation marks from the sentence
input_sentence = input_sentence.replace('!', '').replace('.', '').replace('?', '').replace(',', '')
# Split the input sentence into its component words to check individually
input_words = input_sentence.split()

for ele in x1:
    # Split each element in x1 into words
    ele_words = ele.split()
    # Check if all words are part of the input words
    if all(word in input_words for word in ele_words) and ele in input_sentence:
        print(ele)
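
One caveat with the check above: it tests that each word occurs somewhere in the sentence and that the phrase occurs as a substring, which does not by itself guarantee that the words sit next to each other as whole words. A minimal sketch of a stricter variant that compares consecutive word windows follows; `find_phrases` is a hypothetical helper name, and the sketch assumes words are separated by whitespace only.

```python
def find_phrases(phrases, sentence):
    """Return the phrases whose words appear as consecutive whole words in sentence."""
    words = sentence.split()
    found = []
    for phrase in phrases:
        phrase_words = phrase.split()
        n = len(phrase_words)
        if n == 0:
            continue
        # Slide a window of n words over the sentence and compare the word lists
        if any(words[i:i + n] == phrase_words for i in range(len(words) - n + 1)):
            found.append(phrase)
    return found


x1 = ['esk', 'wild man', 'eskimo', 'sta']
sentence = "eskimo lives as a wild man in wild jungle and he stands as a guard"
print(find_phrases(x1, sentence))  # ['wild man', 'eskimo']
```

With this version, a phrase such as "man in" is not reported for the sentence "woman in a man cave", because no two adjacent words equal "man" followed by "in".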

Answer 3

You can use a regular expression:

import re

x1 = ['esk','wild man','eskimo', 'sta']

my_str = "eskimo lives as a wild man in wild jungle and he stands as a guard"
my_list = []

for words in x1:
    if re.search(r'\b' + words + r'\b', my_str):
        my_list.append(words)
print(my_list)

With the new list, the string (+)-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa raises an error when used as a regular expression, so you can use a try/except block:

for words in x1:
    try:
        if re.search(r'\b' + words + r'\b', my_str):
            my_list.append(words)
    except re.error:
        # Skip entries that are not valid regular expressions
        pass
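
As an alternative to swallowing the exception, each entry can be passed through `re.escape` so that characters such as `(`, `+` and `[` are treated literally. The following is a minimal sketch that reuses the variable names from this answer. Note that `\b` only matches between a word character and a non-word character, so an entry that starts or ends with punctuation such as `(` may still fail to match where you would expect it to.

```python
import re

x1 = ['esk', 'wild man', 'eskimo', 'sta',
      '(+)-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']
my_str = "eskimo lives as a wild man in wild jungle and he stands as a guard"

my_list = []
for words in x1:
    # re.escape makes regex metacharacters in the entry literal, so no re.error is raised
    if re.search(r'\b' + re.escape(words) + r'\b', my_str):
        my_list.append(words)

print(my_list)  # ['wild man', 'eskimo']
```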

Answer 4

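You can use a regular expression with whitespace boundaries, (?<!\S) on the left and (?!\S) on the right, so that partial matches are not returned, and join all the items of the x1 list into a single alternation.

Then use re.findall to get all the matches: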

import re

x1 = ['esk','wild man','eskimo', 'sta','(+)-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']
s = "eskimo lives as a wild man in wild jungle and he stands as a guard"
pattern = fr"(?<!\S)(?:{'|'.join(re.escape(x) for x in x1)})(?!\S)"

print(re.findall(pattern, s))

Output

['eskimo', 'wild man']

See the Python demo.
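
The same pattern can be wrapped in a function shaped like the `get_name` from the question. This is only a sketch: the `phrases` parameter, the longest-first ordering and the case-insensitive flag are assumptions added for illustration and are not part of the original answer. Trying longer entries first matters when one entry is a prefix of another (for example `wild` and `wild man`), because a regex alternation takes the first alternative that matches.

```python
import re


def get_name(input_str, phrases):
    """Return the text matched in input_str for entries of phrases that
    occur as whitespace-delimited tokens (case-insensitive)."""
    if not phrases:
        return []
    # Longest first, so 'wild man' is tried before a shorter entry like 'wild'
    ordered = sorted(phrases, key=len, reverse=True)
    alternatives = '|'.join(re.escape(p) for p in ordered)
    pattern = re.compile(rf"(?<!\S)(?:{alternatives})(?!\S)", re.IGNORECASE)
    matches = []
    for m in pattern.finditer(input_str):
        if m.group(0) not in matches:  # drop repeats, keep first-seen order
            matches.append(m.group(0))
    return matches


x1 = ['esk', 'wild man', 'eskimo', 'sta',
      '(+)-6-[amina(4-chlora)(1-metha-1h-imidol-5-yl)mhyl]-4-(3-chlora)-1-methyl-2(1h)-quinoa']
sentence = "eskimo lives as a wild man in wild jungle and he stands as a guard"
print(get_name(sentence, x1))  # ['eskimo', 'wild man']
```

Unlike the question's version, this sketch returns matches in the order they appear in the sentence rather than passing them through `set`, which does not preserve order.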

4 answers

Answer 1: 2, 2022-06-23 11:43:40

Answer 2: 2, accepted, 2022-06-23 12:42:01

Answer 3: 1, 2022-06-23 11:50:36

Answer 4: 0, 2022-06-23 18:18:18
