Matching words against a string

Question

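I am trying to match a word or string against other strings in order to assign it to a category. I have put together a simple example to show what I am trying to do and the problem I have run into.

I want to match on words, give a category one point for each match, and then pick the category with the most points.

More than one category can have the highest value, and my code just picks the first one in the list. Collecting the tied items and then going through the dictionary a second time feels wrong.

Can someone tell me whether I am heading in the right direction, or whether there is a better way?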

categorys = {
    'fruits': ['apple', 'banana', 'orange', 'pear'],
    'chocolate': ['mars', 'kitkat', 'areo'],
    'drinks': ['coffee', 'tea', 'orange', 'coke']
}

# create a dictionary of points for each category and set each to 0
points = {}
for key, value in categorys.items():
    points[key] = 0

# calculate points for each category
# key = category name, values = words in that category
for key, values in categorys.items():
    if 'orange' in key.lower() or 'drink' in values:
        points[key] += 1

# get the category with the highest points, although this just grabs the first item
calculated_category = max(points.iterkeys(), key=(lambda key: points[key]))
print calculated_category

Edit

Updated code based on the answers:

categorys = {
    'fruits': ['apple', 'banana', 'orange', 'pear'],
    'chocolate': ['mars', 'kitkat', 'areo'],
    'drinks': ['coffee', 'tea', 'orange', 'coke']
}

# create a dictionary of points for each category and set each to 0
points = {}
for key, value in categorys.items():
    points[key] = 0

# calculate points for each category
# key = category name, values = words in that category
for key, values in categorys.items():
    if 'drink' in key.lower():
        points[key] += 1
    if 'orange' in values:
        points[key] += 1

# collect every category that has the highest points, not just the first one
# max(points.iterkeys(), key=(lambda key: points[key]))
max_value = max(points.values())
calculated_category = [k for k, v in points.iteritems() if v == max_value]
print calculated_category

Answer 1

You can use max to get the highest points value, then use a list comprehension to collect every key that has that value.

max_value = max(points.values())
calculated_category = [k for k, v in points.iteritems() if v == max_value]
print calculated_category

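The output is ['fruits', 'drinks', 'chocolate']: in your example every category ends up with the same points value, 0, because no category name contains 'orange' and no word list contains 'drink'. I am not sure whether that is intentional or a separate bug. If you change the condition to if 'drink' in key.lower() or 'orange' in values:, which seems to make more sense, the output is ['fruits', 'drinks'].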
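
For readers on Python 3, where iteritems() is gone and print is a function, the same idea can be sketched as follows. This is only an illustration, assuming the corrected condition from the paragraph above; top_categories is a hypothetical helper name, and the result is sorted so that the order is predictable.

```python
categorys = {
    'fruits': ['apple', 'banana', 'orange', 'pear'],
    'chocolate': ['mars', 'kitkat', 'areo'],
    'drinks': ['coffee', 'tea', 'orange', 'coke'],
}


def top_categories(points):
    """Return every key whose score equals the maximum (hypothetical helper)."""
    if not points:
        return []
    max_value = max(points.values())
    # sort so that tied categories come back in a stable order
    return sorted(k for k, v in points.items() if v == max_value)


# start every category at 0, then score with the corrected condition
points = {key: 0 for key in categorys}
for key, values in categorys.items():
    if 'drink' in key.lower() or 'orange' in values:
        points[key] += 1

print(top_categories(points))
```

Returning a list even when there is a single winner keeps the calling code the same for tied and untied results.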

Answer 2

I am assuming your input will look like this:

fruits, banana
drinks, tea ... and so on

If so, you can do it like this:

# read a line such as "fruits, banana" and split it into category and item
cat, item = [part.strip() for part in raw_input().split(',')]
for key, values in categorys.items():
    if cat == key.lower() and item in values:
        points[key] += 1

# get the (category, points) pair with the highest points; on a tie this still returns a single pair
calculated_category = max(points.items(), key = lambda item_tuple: item_tuple[1])
print calculated_category
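
The input handling above can be separated from the scoring. The following is a minimal Python 3 sketch, assuming the input is a single "category, item" line as described; parse_pair and score_pair are hypothetical names introduced here, not part of the original code.

```python
categorys = {
    'fruits': ['apple', 'banana', 'orange', 'pear'],
    'chocolate': ['mars', 'kitkat', 'areo'],
    'drinks': ['coffee', 'tea', 'orange', 'coke'],
}


def parse_pair(text):
    """Split 'category, item' into two stripped, lower-cased strings."""
    # raises ValueError if the text contains no comma
    cat, item = (part.strip().lower() for part in text.split(',', 1))
    return cat, item


def score_pair(categories, text):
    """Give one point to the category that matches both the name and the item."""
    points = {key: 0 for key in categories}
    cat, item = parse_pair(text)
    for key, values in categories.items():
        if cat == key.lower() and item in values:
            points[key] += 1
    return points


print(score_pair(categorys, 'fruits, banana'))
```

In an interactive script the text argument would come from input(), which replaces raw_input() in Python 3.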

Answer 3

A few additional points about your code:

Use a dictionary comprehension to create the dictionary:

Demo:

>>> points = { key:0 for key in categorys }
>>> points
{'fruits': 0, 'drinks': 0, 'chocolate': 0}
>>>

Use the collections module to create a defaultdict of int

There is no need to assign 0 to every key up front.

Demo:

>>> import collections
>>> points = collections.defaultdict(int)
>>> points["drink"]
0
>>> points["drink"] += 1
>>> points["drink"]
1

Use the iteritems() method to iterate over a dictionary's keys and values (in Python 2, items() returns a list of tuples).
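
The three tips above can be tried together. This is a minimal Python 3 sketch, where iteritems() no longer exists and items() returns a view rather than a list; the scoring rule (one point if 'orange' is in the word list) and the name lazy_points are assumptions made only for the illustration.

```python
from collections import defaultdict

categorys = {
    'fruits': ['apple', 'banana', 'orange', 'pear'],
    'chocolate': ['mars', 'kitkat', 'areo'],
    'drinks': ['coffee', 'tea', 'orange', 'coke'],
}

# dict comprehension: every category starts at 0
points = {key: 0 for key in categorys}

# defaultdict(int): missing keys read as 0, so no initialisation is needed
lazy_points = defaultdict(int)

# items() iterates over (key, value) pairs without building a list in Python 3
for key, values in categorys.items():
    if 'orange' in values:
        points[key] += 1
        lazy_points[key] += 1

print(points)
print(dict(lazy_points))
```

One difference to keep in mind: the defaultdict only contains categories that actually scored, while the comprehension version lists every category, including those with 0 points.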

Answer 4

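If you are going to test words such as "orange" and "drink" against the categories, I suggest building a second dictionary that maps each word to a list of categories (the reverse of your current dictionary). That gives you a quick lookup for the categories a word belongs to, if any, without a lot of iteration.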

from collections import Counter

categorys = {
    'fruits': ['apple', 'banana', 'orange', 'pear'],
    'chocolate': ['mars', 'kitkat', 'areo'],
    'drinks': ['coffee', 'tea', 'orange', 'coke']
}

words_to_cats = {}
for cat, words in categorys.items():
    for word in words:
        words_to_cats.setdefault(word, []).append(cat)

def find_best_category(iterable_of_words):
    score = Counter()
    for word in iterable_of_words:
        score.update(words_to_cats.get(word, [])) # count matches in the word list

        for cat in categorys: # check for partial matches in the category name
            if word in cat:
                score[cat] += 1

    return score.most_common(1)

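Note that this returns only one item: if there is an exact tie, it picks an arbitrary winner from among the top-scoring categories. In your example (input ['orange', 'drink']) there is no tie and it picks "drinks", because both words match that category (one matches "orange" in the word list and the other partially matches the category name).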
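
If every tied category is wanted rather than an arbitrary one, the function can be adapted. The following is a sketch in Python 3, not the answerer's code: find_best_categories is a hypothetical variant that keeps the same scoring but returns all top-scoring category names, sorted alphabetically.

```python
from collections import Counter

categorys = {
    'fruits': ['apple', 'banana', 'orange', 'pear'],
    'chocolate': ['mars', 'kitkat', 'areo'],
    'drinks': ['coffee', 'tea', 'orange', 'coke'],
}

# reverse mapping: word -> list of categories that contain it
words_to_cats = {}
for cat, words in categorys.items():
    for word in words:
        words_to_cats.setdefault(word, []).append(cat)


def find_best_categories(iterable_of_words):
    """Return all categories sharing the highest score (hypothetical variant)."""
    score = Counter()
    for word in iterable_of_words:
        score.update(words_to_cats.get(word, []))  # matches in the word lists
        for cat in categorys:  # partial matches in the category name
            if word in cat:
                score[cat] += 1
    if not score:
        return []  # nothing matched at all
    best = max(score.values())
    return sorted(cat for cat, count in score.items() if count == best)


print(find_best_categories(['orange', 'drink']))
print(find_best_categories(['orange']))
```

With only 'orange' as input, both "fruits" and "drinks" score one point, so both are returned instead of one being chosen arbitrarily.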
