
Check if different word combinations exist in text using Python

I want to write a function that finds certain word combinations in a text and reports which list they belong to. Example:

my_list1 = ["Peter Parker", "Eddie Brock"]
my_list2 = ["Harry Potter", "Severus Snape", "Dumbledore"]

Example input: "Harry Potter was very sad"
Example output: my_list2

You could check which names from each list occur in the string, count them, and use the counts to decide which list the whole string belongs to:

my_list1 = ["Peter Parker", "Eddie Brock"]
my_list2 = ["Harry Potter", "Severus Snape", "Dumbledore"]


to_check = "Harry Potter was very sad"

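def which_list(to_check):
    # Count how many names from each list occur as substrings of the text.
    # (Looping over the string itself would yield single characters,
    # which can never equal a full name.)
    belong_l1 = sum(name in to_check for name in my_list1)
    belong_l2 = sum(name in to_check for name in my_list2)
    if belong_l1 > belong_l2:
        print("string belongs to list 1")
    elif belong_l1 < belong_l2:
        print("string belongs to list 2")
    else:
        print("belonging couldn't be determined")

which_list(to_check)  # prints "string belongs to list 2"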
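The counting idea above extends to any number of lists. The following is a minimal sketch, assuming the lists are kept in a dict keyed by name; the names `LISTS` and `best_list` are illustrative and not part of the original answer. It returns the winning list name instead of printing it, and returns `None` when nothing matches or the top count is tied.

```python
# Hypothetical registry: this dict layout is an assumption for illustration.
LISTS = {
    "my_list1": ["Peter Parker", "Eddie Brock"],
    "my_list2": ["Harry Potter", "Severus Snape", "Dumbledore"],
}


def best_list(text, lists=LISTS):
    """Return the name of the list with the most names found in text.

    Returns None when no list matches or when the top count is tied.
    """
    # Count, per list, how many of its names occur as substrings of text.
    counts = {
        list_name: sum(name in text for name in names)
        for list_name, names in lists.items()
    }
    top = max(counts.values(), default=0)
    winners = [list_name for list_name, n in counts.items() if n == top]
    if top == 0 or len(winners) != 1:
        return None
    return winners[0]


print(best_list("Harry Potter was very sad"))  # my_list2
```

Returning a value rather than printing makes the result easier to reuse, for example as a dictionary key or in a test.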
        

First, I would store each list's name together with its entries.

lst = (
    ("list1", ("Peter Parker", "Eddie Brock")),
    ("list2", ("Harry Potter", "Severus Snape", "Dumbledore")),
    ("list3", ("Harry Potter",)),
)

while True:
    txt = input("Enter some text: ")
    if not txt:  # an empty line ends the loop
        break
    # Report every list that contains a name found in the text.
    for list_name, names in lst:
        for name in names:
            if name in txt:
                print(f"'{name}' found in {list_name}.")

And the result:

Enter some text: Harry Potter was here
'Harry Potter' found in list2.
'Harry Potter' found in list3.
Enter some text: Harry Potter and Dumbledore were here
'Harry Potter' found in list2.
'Dumbledore' found in list2.
'Harry Potter' found in list3.
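The same lookup can be wrapped in a function so it can be called without the interactive loop. This is only a sketch: the helper name `find_matches` and its return shape (a dict mapping each list name to the names it matched) are assumptions, not part of the answer above.

```python
# Same structure as the answer's `lst`: (list name, names) pairs.
lst = (
    ("list1", ("Peter Parker", "Eddie Brock")),
    ("list2", ("Harry Potter", "Severus Snape", "Dumbledore")),
    ("list3", ("Harry Potter",)),
)


def find_matches(txt, groups=lst):
    """Map each list name to the names from that list found in txt."""
    found = {}
    for list_name, names in groups:
        hits = [name for name in names if name in txt]
        if hits:  # leave out lists with no match
            found[list_name] = hits
    return found


print(find_matches("Harry Potter and Dumbledore were here"))
# {'list2': ['Harry Potter', 'Dumbledore'], 'list3': ['Harry Potter']}
```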

Assuming the following data structure:

lists = {
    'my_list1': ["Peter Parker", "Eddie Brock"],
    'my_list2': ["Harry Potter", "Severus Snape", "Dumbledore"]
}

and this argument:

arg = "Harry Potter was very sad"

this comprehension returns the names of all lists that have at least one keyword occurring in the argument:

list_names = [
    list_name
    for list_name, keywords in lists.items()
    if any(kw in arg for kw in keywords)
]
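Note that `kw in arg` is a case-sensitive substring test, so "Dumbledore" also matches inside "Dumbledores" and "harry potter" does not match at all. If whole-word, case-insensitive matching is wanted, one possible variant uses the standard `re` module. This is a sketch under that assumption; the function name `matching_lists` is illustrative.

```python
import re

lists = {
    "my_list1": ["Peter Parker", "Eddie Brock"],
    "my_list2": ["Harry Potter", "Severus Snape", "Dumbledore"],
}


def matching_lists(text, lists):
    """Return names of lists with a keyword that appears as whole words in text."""
    names = []
    for list_name, keywords in lists.items():
        if not keywords:  # an empty alternation would match anywhere
            continue
        # re.escape protects against regex metacharacters in keywords;
        # \b anchors the match at word boundaries.
        pattern = r"\b(?:" + "|".join(re.escape(kw) for kw in keywords) + r")\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            names.append(list_name)
    return names


print(matching_lists("harry potter was very sad", lists))  # ['my_list2']
```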
