What is the difference between dict and collections.defaultdict?

I was checking out Peter Norvig's code on how to write simple spell checkers. At the beginning, he uses this code to insert words into a dictionary.

import collections

def train(features):
    # Every unseen word starts at 1, so its first occurrence ends up as 2.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

What is the difference between a Python dict and the one that was used here? In addition, what is the lambda for? I checked the API documentation, and it says that defaultdict is actually derived from dict, but how does one decide which one to use?

The difference is that a defaultdict will "default" a value if that key has not been set yet. If you didn't use a defaultdict, you'd have to check whether that key exists and, if it doesn't, set it to what you want.
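To make that manual check concrete, here is a minimal sketch of the question's train() rewritten with a plain dict. The name train_plain and the sample word list are made up for this illustration.

```python
import collections

def train_plain(features):
    # Hypothetical plain-dict version: missing keys must be handled by hand.
    model = {}
    for f in features:
        if f not in model:
            model[f] = 1      # the same default the lambda supplies
        model[f] += 1
    return model

def train(features):
    # The defaultdict version from the question.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

words = ["the", "cat", "the"]          # made-up sample input
plain_counts = train_plain(words)
default_counts = train(words)
print(plain_counts == default_counts)  # a defaultdict compares equal to a dict with the same items
```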

The lambda defines a factory for the default value. That function gets called whenever a default value is needed. You could hypothetically have a more complicated default function.
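As a rough illustration of a more complicated default function, the sketch below uses a named factory that builds a fresh record for each missing key. The function new_stats and the sample data are hypothetical.

```python
from collections import defaultdict

def new_stats():
    # Hypothetical factory: called with no arguments, returns a fresh record.
    return {"count": 0, "total": 0.0}

stats = defaultdict(new_stats)

# Made-up (name, amount) pairs to aggregate.
for name, amount in [("a", 2.0), ("b", 1.5), ("a", 3.0)]:
    stats[name]["count"] += 1     # first access to a name calls new_stats()
    stats[name]["total"] += amount
```

Because the factory runs once per missing key, each key gets its own dictionary rather than sharing one object.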

Help on class defaultdict in module collections:

class defaultdict(__builtin__.dict)
 |  defaultdict(default_factory) --> dict with default factory
 |  
 |  The default factory is called without arguments to produce
 |  a new value when a key is not present, in __getitem__ only.
 |  A defaultdict compares equal to a dict with the same items.
 |  

(from help(type(collections.defaultdict())))
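The "in __getitem__ only" part of that help text is easy to miss. A small sketch of what it implies: only d[key] triggers the factory, while get() and the in operator leave the dictionary untouched. The key name "missing" is arbitrary.

```python
from collections import defaultdict

d = defaultdict(int)

looked_up = d.get("missing")   # get() bypasses the factory and returns None
present = "missing" in d       # a membership test does not create the key
size_before = len(d)           # the dict is still empty

value = d["missing"]           # d[key] calls int() and stores the result
size_after = len(d)            # the key now exists
```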

{}.setdefault is similar in nature, but takes a value instead of a factory function. It sets the value if the key doesn't already exist, which is a bit different, though.
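A short sketch of that difference, using made-up data: setdefault receives an already-built value on every call, whereas defaultdict receives the factory once and only calls it for missing keys.

```python
from collections import defaultdict

plain = {}
plain.setdefault("fruits", []).append("apple")  # the default value is passed on every call
plain.setdefault("fruits", []).append("pear")   # this new [] is built, then discarded

auto = defaultdict(list)                        # the factory is given once
auto["fruits"].append("apple")                  # list() is called only for the missing key
auto["fruits"].append("pear")
```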

Courtesy: https://shirishweb.wordpress.com/2017/05/06/python-defaultdict-versus-dict-get/

Using a normal dict

d={}
d['Apple']=50
d['Orange']=20
print(d['Apple'])
print(d['Grapes'])  # raises KeyError

We can also avoid this KeyError with a normal dict by supplying a default at lookup time. Let's see how:

d={}
d['Apple']=50
d['Orange']=20
print(d['Apple'])
print(d.get('Apple'))
print(d.get('Grapes',0)) # DEFAULTING

Using defaultdict

from collections import defaultdict
d = defaultdict(int)  # the argument is a factory; int() returns 0, which becomes the default
d['Apple']=50
d['Orange']=20
print(d['Apple'])
print(d['Grapes'])  # prints 0 instead of raising KeyError

Using a user-defined function to supply the default value

from collections import defaultdict
def mydefault():
        return 0

d = defaultdict(mydefault)
d['Apple']=50
d['Orange']=20
print(d['Apple'])
print(d['Grapes'])

Summary

  1. Defaulting in a normal dict is done case by case, at each lookup; with defaultdict the default is specified once for all missing keys.

  2. According to the linked performance test, defaulting with defaultdict is about twice as fast as defaulting with a normal dict. See https://shirishweb.wordpress.com/2017/05/06/python-defaultdict-versus-dict-get/ for details.

Use a defaultdict if you have some meaningful default value for missing keys and don't want to deal with them explicitly.

The defaultdict constructor takes a function as a parameter and calls that function to construct the value for a missing key.

lambda: 1

is the same as the parameterless function f that does this

def f():
    return 1

I forgot the reason the API was designed this way instead of taking a value as a parameter. If I had designed the defaultdict interface, it would be slightly more complicated: the function that creates the missing value would take the missing key as a parameter.
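For what it's worth, a key-aware default along those lines can be sketched by subclassing dict and defining __missing__, the hook that dict's item lookup calls for absent keys. The class name KeyDefaultDict and its key_factory argument are hypothetical and not part of the collections module.

```python
class KeyDefaultDict(dict):
    """Hypothetical dict whose default factory receives the missing key."""

    def __init__(self, key_factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.key_factory = key_factory

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        value = self.key_factory(key)
        self[key] = value          # store the value, as defaultdict does
        return value

lengths = KeyDefaultDict(len)      # the default for a key is its length
n = lengths["hello"]
```

Here lengths["hello"] computes len("hello") on first access and stores it, so later lookups return the cached value.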

Let's take a deep dive into the Python dictionary and the Python defaultdict class

Python Dictionaries

dict is one of Python's built-in data structures; it stores data as key-value pairs.

Example:

d = {'a': 2, 'b': 5, 'c': 6}

Problem with Dictionary

Dictionaries work well until you hit a missing key. If you look up a key that is not in the dictionary, you get a KeyError. Something like this:

d = {'a': 2, 'b': 5, 'c': 6}
d['z']  # 'z' is not in the dict, so this raises a KeyError

You will see something like this:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
    d['z'] 
KeyError: 'z'

Solution to the above problem

To overcome the above problem, we can use several approaches:

Using built-in dict methods

setdefault

If the key is in the dictionary, return its value. If not, insert the key with a value of default and return default. default defaults to None:

>>> d = {'a' :2, 'b': 5, 'c': 6}
>>> d.setdefault('z', 0)
0  # returns 0 
>>> print(d)  # add z to the dictionary
{'a': 2, 'b': 5, 'c': 6, 'z': 0}

get

Return the value for key if the key is in the dictionary, else default. If default is not given, it defaults to None, so this method never raises a KeyError:

>>> d = {'a': 2, 'b': 5, 'c': 6}
>>> d.get('z', 0)
0  # returns 0 
>>> print(d)  # Doesn't add z to the dictionary unlike setdefault
{'a': 2, 'b': 5, 'c': 6}

Either of these two methods solves our problem without raising a KeyError. Apart from them, Python's collections module provides another way to handle missing keys. Let's dig into the defaultdict in the collections module:

defaultdict

defaultdict can be found in Python's collections module. You can use it like this:

from collections import defaultdict

d = defaultdict(int)

The defaultdict constructor takes default_factory, a callable, as an argument. For example, this can be:

  • int: the default will be the integer 0

  • str: the default will be an empty string ""

  • list: the default will be an empty list []

Code:

from collections import defaultdict

d = defaultdict(list)
d['a']  # access a missing key and returns an empty list
d['b'] = 1 # add a key-value pair to dict
print(d)

The output will be defaultdict(<class 'list'>, {'a': [], 'b': 1}) on Python 3.7 and later, where dicts preserve insertion order ('a' is inserted first, by the lookup on the missing key).
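The three factories listed above can be checked with a short sketch. The variable names and the key "x" are made up for illustration.

```python
from collections import defaultdict

counts = defaultdict(int)
counts["x"] += 1           # int() gives 0, then 1 is added

labels = defaultdict(str)
labels["x"] += "ab"        # str() gives "", then "ab" is appended

groups = defaultdict(list)
groups["x"].append(1)      # list() gives [], then 1 is appended
```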

defaultdict covers much the same ground as the get() and setdefault() methods, so when should you use each?

When to use get()

If you need to look up a key without risking a KeyError and without updating the dictionary, then dict.get is the right choice. It returns the default value you specify but does not modify the dictionary.

When to use setdefault()

If you need the default key-value pair to be stored in the original dictionary, then setdefault is the right choice.

When to use defaultdict

What setdefault does can also be achieved with defaultdict, but instead of supplying the default value on every setdefault call, we specify it once in defaultdict. On the other hand, setdefault lets you provide a different default value for each key. Both have their own advantages depending on the use case.
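A brief sketch of that trade-off, with made-up keys and values: setdefault can give each key its own default, while defaultdict applies a single factory to every missing key.

```python
from collections import defaultdict

# setdefault: each key can get its own default value.
settings = {}
settings.setdefault("retries", 3)
settings.setdefault("tags", [])
settings.setdefault("retries", 10)   # key already exists, so 3 is kept

# defaultdict: one factory, specified once, for all missing keys.
counts = defaultdict(int)
for word in ["a", "b", "a"]:
    counts[word] += 1
```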

When it comes to efficiency:

defaultdict > setdefault() or get()

defaultdict is reported to be about 2 times faster than get()!

You can check the results here.
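If you would rather measure this yourself, the following is a minimal timing sketch using the standard timeit module. The workload (WORDS) and the repeat count are arbitrary assumptions, and the numbers will vary with Python version and hardware, so treat the result as indicative only.

```python
import timeit
from collections import defaultdict

WORDS = ["apple", "pear", "apple", "fig"] * 1000   # made-up workload

def count_with_get():
    counts = {}
    for w in WORDS:
        counts[w] = counts.get(w, 0) + 1   # default supplied at each lookup
    return counts

def count_with_defaultdict():
    counts = defaultdict(int)
    for w in WORDS:
        counts[w] += 1                     # int() called only for new keys
    return counts

get_seconds = timeit.timeit(count_with_get, number=50)
dd_seconds = timeit.timeit(count_with_defaultdict, number=50)
print(f"dict.get: {get_seconds:.4f}s  defaultdict: {dd_seconds:.4f}s")
```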
