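How can I write a program that takes a list of strings as input and returns a dictionary mapping each word to the indices of the strings that contain it?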

Question

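Rules: each key in the dictionary is a word k, and its value is the list of indices of the input strings in which k appears.

Words should be treated as lowercase only, i.e. Hello and hello should be treated as the same word.

You can assume the dataset contains only a list of strings. There is no need to check the types of its elements.

The string data in the dataset will be clean. There is no need to worry about cleaning, i.e. removing punctuation or digits.

In the example below, the function determines the indices at which each word occurs in the given dataset. dataset is a list of strings.

The reverse_index function should create and return the dictionary.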


dataset = [
    "Hello world",
    "This is the WORLD",
    "hello again"
 ]
res = reverse_index(dataset)

# This assertion checks if the result equals the expected dictionary
assert(res == {
    'hello': [0, 2],
    'world': [0, 1],
    'this': [1],
    'is': [1],
    'the': [1],
    'again':[2]
  })

I'm not sure what to do next, but this is how I started:

dataset = [
    "Hello world",
    "This is the WORLD",
    "hello again"
 ] 

def reverse_index(dataset):

Answer 1

You can try this approach:

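def reverse_index(data):
    res = {}
    # enumerate yields each string together with its index
    for i, sentence in enumerate(data):
        # lowercase every word so "Hello" and "hello" share one key
        for word in map(str.lower, sentence.split()):
            if word not in res:
                res[word] = [i]
            else:
                res[word].append(i)
    return res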

Output:

{
    'hello': [0, 2],
    'world': [0, 1],
    'this': [1],
    'is': [1],
    'the': [1],
    'again':[2]
}
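
One thing to be aware of with the loop above: if a word occurs twice in the same string, its index is appended twice. The question's example has no repeated words, so it is unclear whether that matters here. If each index should appear only once per word, a minimal variant might look like the sketch below. The name `reverse_index_unique` is made up for illustration and is not part of the original answer.

```python
def reverse_index_unique(data):
    """Map each lowercased word to the indices of the strings containing it,
    listing each index at most once per word (hypothetical variant)."""
    res = {}
    for i, sentence in enumerate(data):
        for word in sentence.lower().split():
            indices = res.setdefault(word, [])
            # Indices arrive in increasing order, so a repeat of the
            # current index can only be the last element of the list.
            if not indices or indices[-1] != i:
                indices.append(i)
    return res


print(reverse_index_unique(["Hello hello world", "WORLD"]))
```

For input without repeated words inside a single string, this should give the same result as the original function.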

Answer 2

You can use collections.defaultdict as the basis, plus a small loop:

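from collections import defaultdict

# missing keys start out as an empty list
res = defaultdict(list)
for i, s in enumerate(dataset):
    # set() keeps each word once per string, so an index is never repeated
    for w in set(map(str.lower, s.split())):
        res[w].append(i)
# convert back to a plain dict
dict(res)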

Output:

{'hello': [0, 2],
 'world': [0, 1],
 'is': [1],
 'the': [1],
 'this': [1],
 'again': [2]}
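
Since the question asks for a function named reverse_index, the loop above could be wrapped roughly as follows. This is only a sketch of one way to package it, checked against the dataset and assertion from the question.

```python
from collections import defaultdict


def reverse_index(dataset):
    """Return a dict mapping each lowercased word to the indices
    of the strings in `dataset` that contain it."""
    res = defaultdict(list)
    for i, s in enumerate(dataset):
        # set() drops repeated words within one string,
        # so each index is appended at most once per word
        for w in set(s.lower().split()):
            res[w].append(i)
    # return a plain dict rather than a defaultdict
    return dict(res)


dataset = [
    "Hello world",
    "This is the WORLD",
    "hello again",
]
assert reverse_index(dataset) == {
    'hello': [0, 2],
    'world': [0, 1],
    'this': [1],
    'is': [1],
    'the': [1],
    'again': [2],
}
```

Because the words of each string pass through a set, the order of the keys in the returned dictionary may vary between runs. The index lists themselves are unaffected, and dictionary equality ignores key order, so the assertion still holds.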

2 answers

Answer 1: 1, 2021-10-31 16:44:43
Answer 2: 1, accepted, 2021-10-31 16:47:00
