Clarify behaviour: collections.defaultdict vs dict.setdefault

dict provides .setdefault(), which lets you assign a value of any type to a missing key on the fly:

>>> d = dict()
>>> d.setdefault('missing_key', [])
[]
>>> d
{'missing_key': []}

If you use defaultdict for the same task instead, the default value is generated on demand whenever you look up a missing key with d[key]:

>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d['missing_key']
[]
>>> d
defaultdict(<class 'list'>, {'missing_key': []})
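
One difference worth keeping in mind, sketched below with illustrative variable names: setdefault builds its default argument on every call, whereas defaultdict only calls its factory when d[key] hits a missing key. Membership tests and .get() do not create entries.

```python
from collections import defaultdict

plain = {}
plain.setdefault("a", []).append(1)   # creates the list, then appends to it
plain.setdefault("a", []).append(2)   # a new [] is built here but discarded

dd = defaultdict(list)
dd["a"].append(1)                     # factory is called once, on first access
dd["a"].append(2)                     # key exists now, factory is not called

# Membership tests and .get() never trigger the factory.
has_b = "b" in dd
got_b = dd.get("b")

print(plain)          # {'a': [1, 2]}
print(dict(dd))       # {'a': [1, 2]}
print(has_b, got_b)   # False None
```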

However, the following code, implemented with defaultdict, raises a KeyError instead of creating the item with the default value, {}:

import collections

trie = collections.defaultdict(dict)
for word in words:
    t = trie
    for c in word:
        t = t[c]  # KeyError on the second character of the first word
    t["*"] = word

Using .setdefault() works fine:

trie = {}
for word in words:
    t = trie
    for c in word:
        t = t.setdefault(c, {})
    t["*"] = word

Checking before access works too:

trie = {}
for word in words:
    t = trie
    for c in word:
        if c not in t:
            t[c] = {}
        t = t[c]
    t["*"] = word

What am I missing when using collections.defaultdict()?

NB: I am trying to build a trie structure out of a list of words. For example:

words = ["oath", "pea", "eat", "rain"]
trie = {'o': {'a': {'t': {'h': {'*': 'oath'}}}}, 'p': {'e': {'a': {'*': 'pea'}}}, 'e': {'a': {'t': {'*': 'eat'}}}, 'r': {'a': {'i': {'n': {'*': 'rain'}}}}}

In your first example, when you do t = t[c], t becomes a regular empty dict (because that is what you told the defaultdict to generate in the definition of trie).

Let's run through the loop with your example word "oath":

1) t = trie, word = "oath"
2) c = "o"
3) t = t[c]
  3.1) evaluation of t[c] # "o" is not in trie, so trie generates an empty dict at key "o" and returns it to you
  3.2) assignment to t -> t is now the empty dict. If you were to run (t is trie["o"]), it would evaluate to True after this line
4) c = "a"
5) t = t[c]
  5.1) Evaluation of t[c] -> "a" is not in the dict t. This is a regular dict, raise KeyError.
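
The same steps can be checked interactively. This is a minimal sketch of the trace above; the flag names are only illustrative.

```python
import collections

trie = collections.defaultdict(dict)

t = trie["o"]                   # missing key: the factory `dict` builds a plain {}
first_is_plain = type(t) is dict
same_object = t is trie["o"]    # the new dict was stored under "o"

try:
    t["a"]                      # a plain dict has no default factory
    raised = False
except KeyError:
    raised = True

print(first_is_plain, same_object, raised)  # True True True
```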

Unfortunately, I can't think of a way to use defaultdict here (but Marius could, see this answer), because of the arbitrary nesting of a trie. You would need to define the trie as a defaultdict that, for a missing key, generates a defaultdict that itself generates a defaultdict for a missing key, and so on recursively down to the maximum depth (which, in principle, is unknown).

IMO, the best way to implement this is with setdefault, as you did in your second example.
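
As a rough sketch of that approach, the setdefault loop can be wrapped in a function. The names build_trie and contains are hypothetical helpers, not part of the question, and the lookup deliberately uses `in` so that searching never modifies the trie.

```python
def build_trie(words):
    """Build a nested-dict trie; the key "*" marks the end of a word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            # Reuse the child if present, otherwise create and store a new one.
            node = node.setdefault(ch, {})
        node["*"] = word
    return trie


def contains(trie, word):
    """Return True if `word` was inserted; never modifies the trie."""
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return "*" in node


trie = build_trie(["oath", "pea", "eat", "rain"])
print(contains(trie, "pea"))   # True
print(contains(trie, "pe"))    # False
```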

GPhilo's answer is perfectly fine and, indeed, I also believe that setdefault is the proper way of doing it.

However, if you prefer to use defaultdict, it can easily be done like this:

import collections

def ddnestedconst():
    """collections.defaultdict nested-ish constructor."""
    return collections.defaultdict(ddnestedconst)

# ...

trie = collections.defaultdict(ddnestedconst)
# or, being the same:
#trie = ddnestedconst()

for word in words:
    t = trie
    for c in word:
        t = t[c]
    t["*"] = word
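
Two practical points about the recursive approach, shown in a small sketch: the result is a tree of defaultdict objects rather than plain dicts, and a plain lookup of a missing path inserts new nodes. The helpers nested and to_plain are hypothetical names used only for this illustration.

```python
import collections


def nested():
    """A defaultdict whose missing keys are more defaultdicts of the same kind."""
    return collections.defaultdict(nested)


def to_plain(node):
    """Recursively convert nested defaultdicts into plain dicts."""
    if isinstance(node, dict):
        return {key: to_plain(value) for key, value in node.items()}
    return node


trie = nested()
for word in ["oath", "pea"]:
    t = trie
    for c in word:
        t = t[c]
    t["*"] = word

plain = to_plain(trie)
print(plain)
# {'o': {'a': {'t': {'h': {'*': 'oath'}}}}, 'p': {'e': {'a': {'*': 'pea'}}}}

before = len(trie)
trie["x"]["y"]                # a mere lookup inserts "x" (and "y" beneath it)
print(len(trie) - before)     # 1
```

If lookups must not grow the trie, test with `in` or use .get() instead of indexing.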

It may feel a little strange at first, but I find it perfectly readable and semantically accurate.

At that point, you may also prefer to create a new class altogether, inspired by defaultdict, but including all the semantics and specific behaviour that you expect from it.
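
One possible shape for such a class, as a sketch only: a dict subclass that defines `__missing__`, the hook that dict lookups call for absent keys. The class name Trie and its insert method are assumptions made for this example, not something from the answers above.

```python
class Trie(dict):
    """Hypothetical dict subclass whose missing keys become nested Trie nodes."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when `key` is absent.
        node = self[key] = Trie()
        return node

    def insert(self, word):
        """Walk (and create) one node per character, then mark the word end."""
        node = self
        for ch in word:
            node = node[ch]
        node["*"] = word


trie = Trie()
for word in ["oath", "pea", "eat", "rain"]:
    trie.insert(word)

print(trie["p"]["e"]["a"]["*"])  # pea
```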
