Python: Nested dictionary - create if key doesn't exist, else sum 1
SCENARIO
I am trying to count the number of times a word appears in a sentence, for a list of sentences. Each sentence is a list of words.
I want the final dictionary to have a key for each word in the entire corpus, and a second key indicating the sentences in which it appears, with the value being the number of times the word appears in that sentence.
CURRENT SOLUTION
The following code works correctly:
dfm = dict()
for i, sentence in enumerate(sentences):
    for word in sentence:
        if word not in dfm.keys():
            dfm[word] = dict()
        if i not in dfm[word].keys():
            dfm[word][i] = 1
        else:
            dfm[word][i] += 1
QUESTION
Is there a cleaner way to do this in Python?
I have already gone through this and this, where they suggest using:
dic.setdefault(key,[]).append(value)
and,
d = defaultdict(lambda: defaultdict(dict))
I think they are good solutions, but I can't figure out how to adapt them to my particular case.
Thanks!
Say you have this input:
sentences = [['dog','is','big'],['cat', 'is', 'big'], ['cat', 'is', 'dark']]
Your solution:
dfm = dict()
for i, sentence in enumerate(sentences):
    for word in sentence:
        if word not in dfm.keys():
            dfm[word] = dict()
        if i not in dfm[word].keys():
            dfm[word][i] = 1
        else:
            dfm[word][i] += 1
With a defaultdict of int:
from collections import defaultdict
# The outer defaultdict creates an inner defaultdict(int) for each new word;
# the inner one starts every sentence index at 0.
dfm2 = defaultdict(lambda: defaultdict(int))
for i, sentence in enumerate(sentences):
    for word in sentence:
        dfm2[word][i] += 1
Test:
dfm2 == dfm # True
#{'dog': {0: 1},
# 'is': {0: 1, 1: 1, 2: 1},
# 'big': {0: 1, 1: 1},
# 'cat': {1: 1, 2: 1},
# 'dark': {2: 1}}
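The setdefault idea mentioned in the question can be adapted as well, without importing anything. The following is only a sketch of one way to do it; the names dfm3 and inner are illustrative and not from the original posts:

```python
sentences = [['dog', 'is', 'big'], ['cat', 'is', 'big'], ['cat', 'is', 'dark']]

dfm3 = {}
for i, sentence in enumerate(sentences):
    for word in sentence:
        # setdefault returns the inner dict, creating an empty one for a new word
        inner = dfm3.setdefault(word, {})
        # dict.get supplies 0 the first time this sentence index is seen
        inner[i] = inner.get(i, 0) + 1

print(dfm3)
```

Unlike the defaultdict version, the result is a plain dict, so looking up a missing word raises KeyError instead of silently creating an empty entry.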
For a cleaner version, use Counter:
from collections import Counter
string = 'this is america this is america'
x=Counter(string.split())
print(x)
Output:
Counter({'this': 2, 'is': 2, 'america': 2})
If you want to write your own code instead (input data, sentences, copied from @rassar):
def func(list_: list):
    # Count every word across all sentences (totals, not per sentence)
    dic = {}
    for sub_list in list_:
        for word in sub_list:
            if word not in dic.keys():
                dic.update({word: 1})
            else:
                dic[word] += 1
    return dic
sentences = [['dog','is','big'],['cat', 'is', 'big'], ['cat', 'is', 'dark']]
print(func(sentences))
Output:
{'dog': 1, 'is': 3, 'big': 2, 'cat': 2, 'dark': 1}
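Note that the counts above are totals over the whole corpus, whereas the question asks for counts per sentence. One possible way to keep the Counter idea and still get the nested structure is a defaultdict whose inner values are Counter objects. This is a sketch, and dfm4 and plain are illustrative names, not from the original posts:

```python
from collections import Counter, defaultdict

sentences = [['dog', 'is', 'big'], ['cat', 'is', 'big'], ['cat', 'is', 'dark']]

# Each new word gets its own Counter keyed by sentence index
dfm4 = defaultdict(Counter)
for i, sentence in enumerate(sentences):
    for word in sentence:
        dfm4[word][i] += 1

# Optional: convert to plain nested dicts for printing or comparison
plain = {word: dict(counts) for word, counts in dfm4.items()}
print(plain)
```

Keeping the inner values as Counter objects also gives access to helpers such as most_common() for each word.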
from collections import Counter
sentences = ["This is Day", "Never say die", "Chat is a good bot", "Hello World", "Two plus two equals four","A quick brown fox jumps over the lazy dog", "Young chef, bring whisky with fifteen hydrogen ice cubes"]
sentenceWords = ( Counter(x.lower() for x in sentence.split()) for sentence in sentences)
#print result
print("\n".join(str(c) for c in sentenceWords))
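The snippet above yields one Counter per sentence, and because sentenceWords is a generator it can only be iterated once. If the word-first layout from the question is needed, the per-sentence counters could be inverted roughly as follows. This sketch uses only the first three sentences of the list above, and per_sentence and by_word are illustrative names:

```python
from collections import Counter

sentences = ["This is Day", "Never say die", "Chat is a good bot"]

# A list rather than a generator, so it can be iterated more than once
per_sentence = [Counter(w.lower() for w in s.split()) for s in sentences]

# Invert: word -> {sentence index: count}
by_word = {}
for i, counts in enumerate(per_sentence):
    for word, n in counts.items():
        by_word.setdefault(word, {})[i] = n

print(by_word["is"])
```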