
Obtaining the complement of a dictionary

I have a dictionary:

letterstoProbabilityMap = {"aaa": 0.4, "bbb": 0.7, "ccc": 0.1}

Its keys are three-letter strings and its values are their probabilities of occurring (I have shortened the dictionary here). I assign these probabilities based on some training data, but I also want to assign a probability to strings/keys I haven't seen, e.g. "aab". Since all my keys lie in the range aaa-zzz, is there a quick way to obtain the unassigned keys (the complement) and assign them a value? (I understand my question is quite abstract.)

EDIT: The value is not fixed; it is actually a Laplace (add-one smoothed) probability. Below is the code snippet I use to compute the probabilities I do know. The point is that I reserve some probability mass, which I will then assign to the three-letter strings I haven't seen (I know all strings are between aaa and zzz).

for trigram in sorted(threeletter_counts.keys()):
    # Add-one smoothing: +1 on the trigram count, +30 on its two-letter prefix count
    numerator = threeletter_counts[trigram] + 1
    denominator = twoletter_counts[trigram[:2]] + 30
    prob = numerator / denominator
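For what it's worth, the same add-one formula also yields a value for a trigram that never occurred, since its count is simply 0. The following is only a sketch under assumptions: the counts are made up, the constant 30 is copied from the snippet above without knowing what it stands for, and `laplace_probability` and `SMOOTHING_DENOMINATOR` are hypothetical names that do not appear in the original code.

```python
from collections import Counter

# Hypothetical training counts, for illustration only.
threeletter_counts = Counter({"aaa": 3, "aab": 1, "bbb": 2})
twoletter_counts = Counter({"aa": 4, "bb": 2})

# The question adds 30 to the denominator; kept as-is here (assumption:
# it is meant to be the number of possible next symbols).
SMOOTHING_DENOMINATOR = 30


def laplace_probability(trigram):
    """Add-one estimate; unseen trigrams and bigrams count as 0."""
    # Counter returns 0 for a missing key and does not insert it.
    numerator = threeletter_counts[trigram] + 1
    denominator = twoletter_counts[trigram[:2]] + SMOOTHING_DENOMINATOR
    return numerator / denominator


seen = laplace_probability("aaa")           # (3 + 1) / (4 + 30)
unseen = laplace_probability("aaz")         # (0 + 1) / (4 + 30)
unseen_prefix = laplace_probability("zzz")  # (0 + 1) / (0 + 30)
```

Using `Counter` rather than a plain dict means that unseen trigrams and unseen two-letter prefixes need no special case.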

You could go through all strings and use setdefault:

import itertools
import string

for letters in itertools.product(string.ascii_lowercase, repeat=3):
    letterstoProbabilityMap.setdefault(''.join(letters),
                                       computeMissingProbability(letters))

Though if the calculation is expensive and would often go to waste because most keys already exist (setdefault evaluates its default argument even when the key is present), it is better to check first:

for letters in itertools.product(string.ascii_lowercase, repeat=3):
    key = ''.join(letters)
    if key not in letterstoProbabilityMap:
        letterstoProbabilityMap[key] = computeMissingProbability(letters)
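If the set of missing keys itself is what you are after (the "complement" in the question), it can also be built explicitly with a set difference. A minimal sketch, assuming the shortened dictionary from the question; `DEFAULT_PROBABILITY` is a placeholder value, not something the question specifies:

```python
import itertools
import string

letterstoProbabilityMap = {"aaa": 0.4, "bbb": 0.7, "ccc": 0.1}

# Every possible three-letter lowercase string: 26 ** 3 = 17576 keys.
all_trigrams = {
    "".join(letters)
    for letters in itertools.product(string.ascii_lowercase, repeat=3)
}

# The complement: trigrams that are not keys of the dictionary.
# set.difference accepts any iterable; iterating a dict yields its keys.
unseen_trigrams = all_trigrams.difference(letterstoProbabilityMap)

# Placeholder value (assumption); substitute the real smoothed probability.
DEFAULT_PROBABILITY = 0.001
letterstoProbabilityMap.update(dict.fromkeys(unseen_trigrams, DEFAULT_PROBABILITY))
```

This only suits a single shared value. When the value depends on the key, as with the Laplace estimate, loop over `unseen_trigrams` and compute each one instead.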

Or maybe use a defaultdict, if that works for you:

import collections
fullMap = collections.defaultdict(lambda: 0.123, letterstoProbabilityMap)

If the default value is just 0.0:

fullMap = collections.defaultdict(float, letterstoProbabilityMap)
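One caveat: a defaultdict's factory takes no arguments, so it cannot compute a value that depends on the missing key, and looking up a missing key with `[]` stores the default in the dictionary. If the probability depends on the key, a dict subclass that defines `__missing__` may fit better. This is only a sketch; `TrigramProbabilities` and the `missing_probability` callback are hypothetical names, and the constant rule is a stand-in for the real computation:

```python
class TrigramProbabilities(dict):
    """dict that computes a probability for keys it has not stored."""

    def __init__(self, known, missing_probability):
        super().__init__(known)
        # Hypothetical callback: maps an unseen key to its probability.
        self._missing_probability = missing_probability

    def __missing__(self, key):
        # Called by dict.__getitem__ only when `key` is absent.
        # The value is returned without being stored, so the dict stays small.
        return self._missing_probability(key)


known = {"aaa": 0.4, "bbb": 0.7, "ccc": 0.1}
# Placeholder rule for illustration: a constant for every unseen key.
full_map = TrigramProbabilities(known, lambda key: 0.001)

seen_value = full_map["aaa"]    # stored value
unseen_value = full_map["abc"]  # computed by __missing__
```

Note that only `full_map[key]` goes through `__missing__`; `full_map.get("abc")` and `"abc" in full_map` still treat the key as absent.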

