Given a long string, find the matching strings in the dictionary

Let's assume that we have a dictionary whose keys are food names.

A user can input a string of food names (e.g. "banana yogurt steak cheese cake"). For the sake of simplicity, let's assume that everything is lower-case and that no food name appears more than once in the string.

What would be the most efficient way to find out which keys from the dictionary appear in the food name string? It's possible that no key appears at all, and it's also possible that the food name string consists only of dictionary keys.

This is for self-study. A suffix trie seems to be a good option, but I'm not sure how to optimize it (since there may be multiple matches in this case). If there are well-known theorems or references for this, I would really appreciate them.

P.S. Thanks for the answers, but one problem remains: a food name does not have to be a single word. Consider something like this:

"mild coffee ice cream"

It's a 4-word string, but it contains only 2 food names: mild coffee and ice cream.

You can use set intersection for that purpose:

set(food_string.split()) & food_dict.keys()

Note that in Python 3, dict.keys() returns a view object that, although not a set, supports set operations such as &.
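As a minimal sketch of the intersection in action, here is a small runnable example. The contents of `food_dict` and `food_string` are made up for illustration, and it only finds single-word food names, since it splits on whitespace.

```python
# Hypothetical sample data; the dictionary values are placeholders.
food_dict = {"banana": 1, "steak": 2, "cheese": 3, "salad": 4}
food_string = "banana yogurt steak cheese cake"

# Split the input on whitespace, then intersect the resulting words
# with the dictionary's key view.
matches = set(food_string.split()) & food_dict.keys()

print(sorted(matches))  # ['banana', 'cheese', 'steak']
```

Building the set costs time proportional to the number of words, and each membership test against the dictionary is a hash lookup on average.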

If no unambiguous boundary for food names can be identified (such as a space), then you can use the following:

[k for k in food_dict if k in food_string]
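One caveat with a plain substring test: a key such as "ice" would also be reported for an input containing "rice". If that matters, a possible variation is to accept a key only when it sits on word boundaries. The sketch below uses the standard `re` module; the helper name `keys_in_string` and the sample data are assumptions for illustration, not part of the original answer.

```python
import re

def keys_in_string(food_dict, food_string):
    """Return the keys that occur in food_string as whole words or phrases."""
    found = []
    for key in food_dict:
        # \b requires a word boundary on each side of the key;
        # re.escape protects any regex metacharacters inside the key.
        pattern = r"\b" + re.escape(key) + r"\b"
        if re.search(pattern, food_string):
            found.append(key)
    return found

# Hypothetical sample data.
food_dict = {"ice": 1, "ice cream": 2, "mild coffee": 3, "steak": 4}
print(keys_in_string(food_dict, "mild coffee rice cream"))  # ['mild coffee']
```

Like the list comprehension, this still scans the string once per key, so it trades some speed for fewer false positives.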

Is this what you meant?

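# Names chosen to avoid shadowing the built-ins dict and input.
food_dict = {
  "banana": "answer1",
  "steak": "answer1",
  "cheese": "answer1"
}
food_string = "banana yogurt steak cheese cake"
# Check each whitespace-separated word against the dictionary keys.
for word in food_string.split():
    print(word)
    if word in food_dict:
        print("true")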

Output:

banana
true
yogurt
steak
true
cheese
true
cake

If a food name can be more than one word:

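# Names chosen to avoid shadowing the built-ins dict and input.
food_dict = {
  "banana milk shake": "answer1",
  "steak": "answer2",
  "ice cream": "answer3",
  "salad dressing": "answer4"
}
food_string = "banana milk shake yogurt steak cheese ice cream cake"
# Test every key as a substring of the whole input string.
for key in food_dict:
    print(key)
    if key in food_string:
        print("true")
    else:
        print("false")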

Output:

banana milk shake
true
steak
true
ice cream
true
salad dressing
false
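The loop above scans the whole input once per key, which can get slow for a large dictionary. Since the question asks about efficiency and tries, here is a rough sketch of one alternative: a trie built over the words of each key, scanned left to right while taking the longest key that starts at each position. The function names `build_trie` and `find_foods` are hypothetical, and the greedy longest-match rule is an assumption about the desired behavior. This is a simplified word-level trie, not the Aho-Corasick algorithm, which is the well-known general technique for matching many patterns in a single pass.

```python
def build_trie(keys):
    """Build a word-level trie: each node is a dict mapping a word to a child node."""
    root = {}
    for key in keys:
        node = root
        for word in key.split():
            node = node.setdefault(word, {})
        node[None] = key  # the None entry marks the end of a complete key
    return root

def find_foods(trie, food_string):
    """Scan left to right, taking the longest key that starts at each word."""
    words = food_string.split()
    found, i = [], 0
    while i < len(words):
        node, longest, end = trie, None, i
        for j in range(i, len(words)):
            node = node.get(words[j])
            if node is None:
                break  # no key continues with this word
            if None in node:
                longest, end = node[None], j + 1
        if longest is None:
            i += 1   # no key starts here; move on by one word
        else:
            found.append(longest)
            i = end  # skip past the matched key
    return found

keys = ["banana milk shake", "steak", "ice cream", "salad dressing"]
trie = build_trie(keys)
print(find_foods(trie, "banana milk shake yogurt steak cheese ice cream cake"))
# ['banana milk shake', 'steak', 'ice cream']
```

With this approach each position in the input walks at most as many trie levels as the longest key has words, regardless of how many keys the dictionary holds. Because matched words are skipped, overlapping keys (for example "ice" inside "ice cream") are reported only as the longer one.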
