Dictionary key-value shows only unique results instead of all

Question

I have corpus_test, which I convert to a list of words by splitting it. From this I need two dictionaries and the number of words in the text. The problem is unique values: I need all of them, even duplicates.

from collections import defaultdict

corpus_test = 'cat dog tiger tiger tiger cat dog lion'
corpus_test = [[word.lower() for word in corpus_test.split()]]
word_counts = defaultdict(int)
for rowt in corpus_test:
    for wordt in rowt:
        word_counts[wordt] += 1

    # index -> word
    index_wordso = dict((i, word) for i, word in enumerate(rowt))

    # word -> index (intended as the inverse of index_wordso)
    word_indexso = dict((word, i) for i, word in enumerate(rowt))

    v_countso = len(index_wordso)

My code gives the right output for index_wordso and v_countso:

index_wordso
#{0: 'cat',
# 1: 'dog',
# 2: 'tiger',
# 3: 'tiger',
# 4: 'tiger',
# 5: 'cat',
# 6: 'dog',
# 7: 'lion'}


v_countso
#8

But word_indexso (the inverse of index_wordso) gives the wrong output:

word_indexso
#{'cat': 5, 'dog': 6, 'tiger': 4, 'lion': 7}

That gives me only the last index for each word, not all of them. I need all 8 values.

Answer 1

Keys in a dictionary are unique; values are not. It's like a word dictionary: a word can have multiple definitions, but it cannot be listed more than once.
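To see why only the last index survives, here is a minimal sketch of what happens when the same key is assigned repeatedly. The name word_to_index is just an illustrative stand-in for the question's word_indexso.

```python
# Each assignment to an existing key replaces the value stored there,
# so a repeated word keeps only the last index that was written.
words = 'cat dog tiger tiger tiger cat dog lion'.split()

word_to_index = {}
for i, word in enumerate(words):
    word_to_index[word] = i  # overwrites any earlier index for this word

print(word_to_index)
# {'cat': 5, 'dog': 6, 'tiger': 4, 'lion': 7}
print(len(word_to_index))
# 4
```

Eight words go in, but only four distinct keys come out, which matches the output shown in the question.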

A workaround is to use a list of tuples:

corpus_test = 'cat dog tiger tiger tiger cat dog lion'
corpus_test = [word.lower() for word in corpus_test.split()]
print([(word, i) for i, word in enumerate(corpus_test)])

which results in

[('cat', 0),
 ('dog', 1),
 ('tiger', 2),
 ('tiger', 3),
 ('tiger', 4),
 ('cat', 5),
 ('dog', 6),
 ('lion', 7)]

Keep in mind, though, that this is not a lookup table, so you must loop through the elements (in some way) to find a specific element.
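As a rough illustration of that looping, the sketch below scans the whole list of tuples for one word. The helper name indices_of and the variable pairs are hypothetical, introduced only for this example.

```python
words = 'cat dog tiger tiger tiger cat dog lion'.split()
pairs = [(word, i) for i, word in enumerate(words)]

def indices_of(target, pairs):
    """Return every index paired with `target` by scanning the full list."""
    return [i for word, i in pairs if word == target]

print(indices_of('tiger', pairs))  # [2, 3, 4]
print(indices_of('zebra', pairs))  # []
```

Every call walks the entire list, so this costs time proportional to the number of words per lookup.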

Another method is to use a dictionary of lists:

from collections import defaultdict

word_indexso = defaultdict(list)
corpus_test = 'cat dog tiger tiger tiger cat dog lion'.split()

for index, word in enumerate(corpus_test):
    word_indexso[word].append(index)

print(word_indexso)

which results in

defaultdict(<class 'list'>, {'cat': [0, 5], 'dog': [1, 6], 'tiger': [2, 3, 4], 'lion': [7]})

which can be looked up with, e.g., word_indexso["cat"] to get the list of indices associated with the word.
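A short usage sketch of that lookup follows. One thing to watch for: indexing a defaultdict with a missing key inserts that key, so .get is used here for words that may be absent. The word 'zebra' and the variable total are only illustrative.

```python
from collections import defaultdict

word_indexso = defaultdict(list)
for index, word in enumerate('cat dog tiger tiger tiger cat dog lion'.split()):
    word_indexso[word].append(index)

print(word_indexso['cat'])            # [0, 5]

# .get returns the fallback without adding 'zebra' as a new key.
print(word_indexso.get('zebra', []))  # []
print('zebra' in word_indexso)        # False

# Every position is kept: the list lengths add up to the word count.
total = sum(len(indices) for indices in word_indexso.values())
print(total)                          # 8

# Convert to a plain dict for a cleaner printout.
print(dict(word_indexso))
```

Wrapping the result in dict() drops the defaultdict(<class 'list'>, ...) wrapper from the printed form without changing the stored data.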
