
How can I create a dictionary that contains words from a text as keys and the sublists in which they appear as values?

My question is similar to others, but my list is a bit special.

I have to create a search engine in Python. For that, I need to create the dictionary described in the title.

Let me give you the context:

I basically have a text made of several parts separated by "[==========]".

For example:

  [blablabla][blabliblou]
  [==========]
  [blablablou][blibloubla]
  [=========]
  [oubabababa][baboulila]

I wrote an algorithm that combines these lists until it hits a "==========" and puts them into a single list, where [blablabla blabliblou] is list[0], [blablablou blibloubla] is list[1], etc.

The algorithm:

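  import re
  file = open("mytext.txt","r",encoding="utf-8")
  list = []  # note: this name shadows the built-in list type
  dico = {}
  d = file.read()

  # split the whole text on runs of "=" and collect the parts
  x = re.split(r"=+", d)
  for i in range(len(x)):
      list.append(x[i])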

I get output like:

  [ [blablabla blabliblou] [blablablou blibloubla] [oubabababa baboulila] ]

The second step is to create a dictionary that has every word of the text as a key and the sublist(s) containing it as the value(s).

I tried to use a loop with conditions, as follows:

  import re
  file = open("mytext.txt","r",encoding="utf-8")
  list = []
  numd = 0
  dico = {}
  d = file.read()

  for x in file:
      x = re.split(r"=+", d)
      for i in range(len(x)):
          list.append(x[i])
          numd =+ 1
          for word in list:
              if word in dico:
                  if numd not in dico[word]:
                      dico[word].append(numd)
              else:
                  dico[word] = [numd]

The expected output is:

    {blablabla:1, blabliblou:1, blablablou:2, blibloubla:2, oubabababa:3, baboulila:3}

but my list is still empty.

Thank you in advance for your reply! I would be very grateful.
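
One likely reason the dictionary stays empty in the second snippet: `file.read()` consumes the whole file, so the later `for x in file:` loop has nothing left to iterate over and its body never runs. Separately, `numd =+ 1` assigns `+1` instead of incrementing. The sketch below reproduces both effects; `io.StringIO` and the sample text are stand-ins for the real `mytext.txt`, and `lines_seen` is a name introduced only for illustration.

```python
import io

# Stand-in for the opened file; the sample text is made up for illustration.
file = io.StringIO("blablabla blabliblou\n==========\nblablablou blibloubla\n")

d = file.read()      # reads everything and leaves the position at end of file

lines_seen = 0
for x in file:       # nothing is left to read, so this body never runs
    lines_seen += 1

numd = 0
numd =+ 1            # parsed as numd = (+1): an assignment, not an increment
numd =+ 1            # still 1 afterwards; numd += 1 is the increment form

print(lines_seen, numd)
```

Dropping the outer `for x in file:` loop and working directly on `d` avoids the first problem.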

How about this?

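from collections import defaultdict

# x is assumed to be a list of word lists, e.g. [["blablabla", "blabliblou"], ...]
all_dict = defaultdict(list)
for index, val in enumerate(x):
    for value in val:
        # record each sublist index at most once per word
        if index not in all_dict[value]:
            all_dict[value].append(index)

print(all_dict)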

It gives you the following output (sublist indices start at 0):

defaultdict(list,
            {'blablabla': [0],
             'blabliblou': [0],
             'blablablou': [1],
             'blibloubla': [1],
             'oubabababa': [2],
             'baboulila': [2]})

from collections import defaultdict

l = [ ["blablabla", "blabliblou"], ["blablablou", "blibloubla"], ["oubabababa", "baboulila"] ]

d = defaultdict(list)
for i, line in enumerate(l):
    # a plain loop is clearer than a list comprehension used only for its side effects
    for word in line:
        d[word].append(i)

print(dict(d))
# {'blablabla': [0], 'blabliblou': [0], 'blablablou': [1], 'blibloubla': [1], 'oubabababa': [2], 'baboulila': [2]}
# (insertion order on Python 3.7+; older versions may print the keys in another order)
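
If a word can occur more than once in the same sublist, the append-based loop above records that sublist's index once per occurrence. A minimal sketch of one way around this, assuming the same list-of-word-lists input, is to collect indices in a set and sort them at the end. The names `sublists`, `index`, and `result` are illustrative only.

```python
from collections import defaultdict

# Sample data with a repeated word, made up for illustration.
sublists = [["blablabla", "blabliblou", "blablabla"], ["blablablou", "blablabla"]]

index = defaultdict(set)
for i, words in enumerate(sublists):
    for word in words:
        index[word].add(i)   # a set ignores repeated indices

# Convert the sets to sorted lists for stable, readable output.
result = {word: sorted(ids) for word, ids in index.items()}
print(result)
# {'blablabla': [0, 1], 'blabliblou': [0], 'blablablou': [1]}
```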

This is the code that I have so far:

  import re
  from collections import defaultdict
  file = open("mytext.txt","r",encoding="utf-8")
  l = []
  d = file.read()

  x = re.split(r"=+", d)
  for i in range(len(x)):
      l.append(x[i])

  d = defaultdict(list)
  for i, line in enumerate(l):
      [d[word].append(i) for word in line]

It seems to work, but the keys are letters and the values are the sublists in which each letter occurs.
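
The letters appear because `re.split` returns strings, and iterating over a string yields its characters. A minimal sketch of a fix, assuming words are runs of `\w` characters, is to break each section into words before indexing. `build_index` and `sample` are hypothetical names, and the sample text stands in for the contents of `mytext.txt`.

```python
import re
from collections import defaultdict

def build_index(text):
    """Map each word to the indices of the sections that contain it."""
    sections = re.split(r"=+", text)          # same separator pattern as in the question
    index = defaultdict(list)
    for i, section in enumerate(sections):
        # Break the section string into words instead of iterating over characters.
        for word in re.findall(r"\w+", section):
            if i not in index[word]:          # record each section once per word
                index[word].append(i)
    return dict(index)

# Made-up sample text in the shape described in the question.
sample = (
    "blablabla blabliblou\n"
    "==========\n"
    "blablablou blibloubla\n"
    "=========\n"
    "oubabababa baboulila\n"
)
print(build_index(sample))
# {'blablabla': [0], 'blabliblou': [0], 'blablablou': [1], 'blibloubla': [1], 'oubabababa': [2], 'baboulila': [2]}
```

Use `i + 1` instead of `i` if the sections should be numbered from 1, as in the expected output. Note that this pattern also splits on any "=" that appears inside a section.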
