Unicode string key error in Python dict
I have the following code:
import codecs

corpus_file = codecs.open("corpus_en-tr.txt", encoding="utf-8").readlines()
corpus = []
# Lines alternate: a source sentence followed by its target sentence
for a in range(0, len(corpus_file), 2):
    corpus.append({'src': corpus_file[a].rstrip(), 'tgt': corpus_file[a+1].rstrip()})
params = {}
for sentencePair in corpus:
    for tgtWord in sentencePair['tgt']:
        for srcWord in sentencePair['src']:
            params[srcWord][tgtWord] = 1.0
Basically, I am trying to create a dictionary of dictionaries of floats, but I get the following error:
Traceback (most recent call last):
  File "initial_guess.py", line 15, in <module>
    params[srcWord][tgtWord] = 1.0
KeyError: u'A'
I checked the related question "UTF-8 string as key in dictionary causes KeyError", but it doesn't help.
I don't understand why the Unicode string 'A' is not allowed as a dictionary key in Python. Is there any way to fix it?
Your params dict is empty, so params[srcWord] raises the KeyError before the inner assignment is ever reached. The Unicode key itself is fine.
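A minimal reproduction may make this clearer. The keys 'A' and 'B' below are made up for illustration; the point is only that a nested assignment on an empty plain dict fails on the outer lookup, whatever the key type:

```python
params = {}

try:
    # params['A'] is looked up first; it does not exist, so KeyError is
    # raised and the assignment to ['B'] never happens.
    params['A']['B'] = 1.0
except KeyError as exc:
    missing_key = exc.args[0]

# Creating the inner dict first makes the same assignment work.
params['A'] = {}
params['A']['B'] = 1.0
```

So the fix is to make sure an inner dict exists for each outer key before assigning into it.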
You can use a tree for that:
from collections import defaultdict

def tree():
    return defaultdict(tree)

params = tree()
params['any']['keys']['you']['want'] = 1.0
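One thing to keep in mind with this approach, sketched below with made-up keys: merely reading a missing key from a defaultdict inserts it, so a mistyped lookup silently creates an empty subtree. Membership tests and .get() do not insert anything.

```python
from collections import defaultdict

def tree():
    return defaultdict(tree)

params = tree()
params['a']['b'] = 1.0

# Reading a missing key with [] creates it as an empty subtree.
_ = params['typo']
created_by_lookup = 'typo' in params

# `in` and .get() look without inserting.
has_other = 'other' in params
value = params.get('other')
still_absent = 'other' not in params
```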
Or a simpler defaultdict case without tree:
from collections import defaultdict

params = defaultdict(dict)
for sentencePair in corpus:
    for tgtWord in sentencePair['tgt']:
        for srcWord in sentencePair['src']:
            params[srcWord][tgtWord] = 1.0
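As a quick check, here is the defaultdict(dict) version run on a tiny in-memory corpus. The sample sentences are invented and only stand in for the contents of corpus_en-tr.txt. Note that iterating over a string yields single characters, which is consistent with the one-character key u'A' in the traceback; the .split() calls below are an assumption that whole words were intended as keys, so drop them if character-level keys are what you want.

```python
from collections import defaultdict

# Hypothetical sample data standing in for corpus_en-tr.txt.
corpus = [
    {'src': 'a house', 'tgt': 'bir ev'},
    {'src': 'a book', 'tgt': 'bir kitap'},
]

params = defaultdict(dict)
for sentencePair in corpus:
    # Assumption: keys should be words, so split on whitespace.
    for tgtWord in sentencePair['tgt'].split():
        for srcWord in sentencePair['src'].split():
            params[srcWord][tgtWord] = 1.0

# Convert the outer defaultdict to a plain dict, e.g. for printing.
plain = dict(params)
```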
If you don't want to use defaultdict, just add a dict to params on every iteration:
params = {}
for sentencePair in corpus:
    for srcWord in sentencePair['src']:
        params.setdefault(srcWord, {})
        for tgtWord in sentencePair['tgt']:
            params[srcWord][tgtWord] = 1.0
Please note that I've changed the order of the for loops, because you need to know srcWord first.
Otherwise, keeping the original loop order, you have to check for the key on every inner iteration:
params = {}
for sentencePair in corpus:
    for tgtWord in sentencePair['tgt']:
        for srcWord in sentencePair['src']:
            params.setdefault(srcWord, {})[tgtWord] = 1.0
You can just use setdefault:
Replace

params[srcWord][tgtWord] = 1.0

with

params.setdefault(srcWord, {})[tgtWord] = 1.0
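To illustrate why this one-liner works, here is a small sketch with made-up keys. dict.setdefault returns the value already stored under the key if there is one; otherwise it stores the given default and returns that same object, so the result can be indexed straight away.

```python
params = {}

# Key is missing: setdefault stores the default dict and returns it.
inner = params.setdefault('a', {})
inner['x'] = 1.0

# Key exists: setdefault returns the stored dict and ignores the new default.
same_inner = params.setdefault('a', {})
same_inner['y'] = 1.0

# Both steps in one expression, as in the replacement line.
params.setdefault('b', {})['x'] = 1.0
```

A small cost of this form is that a fresh empty dict is built on every call, even when the key already exists.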