Dictionary with a query of sets in python

I am trying to get the position of each word in a list of strings and store it in a dictionary whose keys are the words and whose values are sets of the indices of the strings in which each word appears.

list_x = ["this is the first", "this is the second"]
my_dict = {}
for i in range(len(list_x)):
    for x in list_x[i].split():
        if x in my_dict:
            my_dict[x] += 1
        else:
            my_dict[x] = 1
print(my_dict)

This is the code I tried, but it gives me the total number of times each word appears in the list. What I am trying to get is this format:

{'this': {0, 1}, 'is': {0, 1}, 'the': {0, 1}, 'first': {0}, 'second': {1}}

As you can see, 'this' is a key, and it appears once in the string at position 0 and once in the string at position 1, and so on for the other words. Any idea how I might get to this point?

I changed two lines so that each word maps to a list of positions instead of a count:

list_x = ["this is the first", "this is the second"]
my_dict = {}
for i in range(len(list_x)):
    for x in list_x[i].split():
        if x in my_dict:
            my_dict[x].append(i)
        else:
            my_dict[x] = [i]
print(my_dict)

Returns:

{'this': [0, 1], 'is': [0, 1], 'the': [0, 1], 'first': [0], 'second': [1]}
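
One thing to watch with the list version: if a word occurs more than once in the same string, its index is appended more than once. The sketch below runs the same loop on a made-up input (the first string is hypothetical, not from the question) and then converts each list to a set, which is one way to get the format the question asks for. The name as_sets is only illustrative.

```python
# Same loop as above, run on a hypothetical input where "is" occurs
# twice in the first string.
list_x = ["this is what it is", "this is the second"]
my_dict = {}
for i in range(len(list_x)):
    for x in list_x[i].split():
        if x in my_dict:
            my_dict[x].append(i)
        else:
            my_dict[x] = [i]

print(my_dict["is"])  # [0, 0, 1] (index 0 is recorded twice)

# Convert each list of positions to a set to drop the duplicates.
as_sets = {word: set(positions) for word, positions in my_dict.items()}
print(as_sets["is"])  # {0, 1}
```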

You can also do this with defaultdict and enumerate:

from collections import defaultdict
list_x = ["this is the first",
          "this is the second",
          "third is this"]
pos = defaultdict(set)
for i, sublist in enumerate(list_x):
    for word in sublist.split():
        pos[word].add(i)

Output:

>>> from pprint import pprint
>>> pprint(dict(pos))
{'first': {0},
 'is': {0, 1, 2},
 'second': {1},
 'the': {0, 1},
 'third': {2},
 'this': {0, 1, 2}}

The purpose of enumerate is to provide the index (position) of each string within list_x. For each word encountered, the position of its string within list_x is added to the set stored under that word's key in the result, pos.
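
To make the enumerate step concrete, and to show why sets are convenient as values, here is a small sketch. The function name build_positions and the two example queries are my own additions for illustration, not part of the answer above.

```python
from collections import defaultdict


def build_positions(sentences):
    """Map each word to the set of indices of the strings containing it."""
    pos = defaultdict(set)
    for i, sentence in enumerate(sentences):
        for word in sentence.split():
            pos[word].add(i)
    return dict(pos)


list_x = ["this is the first", "this is the second", "third is this"]

# enumerate yields (index, item) pairs.
pairs = list(enumerate(list_x))
print(pairs[0])  # (0, 'this is the first')

pos = build_positions(list_x)

# Strings that contain both "this" and "the": set intersection.
both = pos["this"] & pos["the"]
print(sorted(both))  # [0, 1]

# Strings that contain "first" or "third": set union.
either = pos["first"] | pos["third"]
print(sorted(either))  # [0, 2]
```

Because the values are sets, queries like these reduce to the built-in set operators, and a word that repeats within one string is still recorded only once.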

Rather than storing integers in your dict, you should store a set:

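list_x = ["this is the first", "this is the second"]
my_dict = {}
for i in range(len(list_x)):
    for x in list_x[i].split():
        if x in my_dict:
            my_dict[x].add(i)
        else:
            my_dict[x] = {i}  # start a new set holding this position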

Or, more briefly,

for i in range(len(list_x)):
    for x in list_x[i].split():
        my_dict.setdefault(x, set()).add(i)
