How does memory allocation work in Python dictionaries?

I want to understand how memory allocation works in Python when new data is added to a dictionary. In the code below, I expected each newly added item to be placed at the end, but that is not what happens.

repetitions = {}
for item in new_deltas:
    list_aux = []
    if float(item[1]) <= 30:
        if float(item[0]) in repetitions:
            aux = repetitions[float(item[0])]
            aux.append(item[1])
            repetitions[float(item[0])] = aux
        else:
            list_aux.append(item[1])
            repetitions[float(item[0])] = list_aux
    print(repetitions)

The results I got are shown below. I would like to understand why a newly added item is not placed at the end of the dictionary but shows up in the middle of it instead.

My input data is:

new_deltas = [[1.452, 3.292182683944702], [1.449, 4.7438647747039795], [1.494, 6.192960977554321], [1.429, 7.686920166015625]] 

The print line outputs:

{1.452: [3.292182683944702]}
{1.452: [3.292182683944702], 1.449: [4.7438647747039795]}
{1.452: [3.292182683944702], 1.494: [6.192960977554321], 1.449: [4.7438647747039795]}
{1.429: [7.686920166015625], 1.452: [3.292182683944702], 1.494: [6.192960977554321], 1.449: [4.7438647747039795]}

Short answer

Dicts are implemented as hash tables rather than stacks.

Without additional measures, that tends to scramble the order of keys.

Hash Tables

Prior to Python 3.6, the iteration order of a dictionary was determined by the hash values of its keys, so it looked arbitrary. Roughly, here's how it worked:

d = {}        # Make a new dictionary
              # Internally 8 buckets are formed:
              #    [ [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] ]
d['a'] = 10   # hash('a') % 8 gives perhaps bucket 5:
              #    [ [ ] [ ] [ ] [ ] [ ] [('a', 10)] [ ] [ ] ]
d['b'] = 20   # hash('b') % 8 gives perhaps bucket 2:
              #    [ [ ] [ ] [('b', 20)] [ ] [ ] [('a', 10)] [ ] [ ] ]

So, you can see that iterating over this dict would yield 'b' before 'a', because the hash function put 'b' in an earlier bucket.
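To make the bucket picture concrete, here is a minimal sketch of a toy hash table with 8 buckets. It is a simplification for illustration and not CPython's actual layout; the names make_table, put, and keys_in_bucket_order are made up for this example. Small integer keys are used because their hash values are the integers themselves, which keeps the result reproducible.

```python
# Toy chained hash table with a fixed number of buckets.
# This is an illustrative simplification, not CPython's real implementation.
NUM_BUCKETS = 8

def make_table():
    return [[] for _ in range(NUM_BUCKETS)]

def put(table, key, value):
    bucket = table[hash(key) % NUM_BUCKETS]
    for i, (existing_key, _) in enumerate(bucket):
        if existing_key == key:
            bucket[i] = (key, value)   # overwrite an existing key
            return
    bucket.append((key, value))

def keys_in_bucket_order(table):
    # Iteration walks the buckets left to right and ignores insertion order.
    return [key for bucket in table for key, _ in bucket]

table = make_table()
put(table, 13, 'first')    # hash(13) % 8 == 5
put(table, 2, 'second')    # hash(2) % 8 == 2
print(keys_in_bucket_order(table))   # [2, 13]
```

Key 13 was inserted first, but it comes out second because it landed in a later bucket.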

Newer hash tables that remember insertion order

Starting in Python 3.6, a stack was added as well: new entries are appended to it in insertion order, and the hash table itself stores only indices into it. See this proof-of-concept for a better idea of how that works.

Accordingly, dicts started to remember insertion order, and this behavior became guaranteed in Python 3.7 and later.
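The idea can be sketched as follows. This is a toy model under stated assumptions, not the actual CPython code: the class name CompactDict is hypothetical, it uses simple linear probing, and it omits resizing and deletion. A sparse indices table is probed by hash, while a dense entries list records items in the order they arrive.

```python
class CompactDict:
    """Toy insertion-ordered hash table (hypothetical, for illustration)."""

    def __init__(self, size=8):
        self.indices = [None] * size   # sparse: slot -> position in entries
        self.entries = []              # dense: appended in insertion order

    def _find_slot(self, key):
        # Linear probing: start at the hashed slot and step forward until
        # we find the key or an empty slot.
        slot = hash(key) % len(self.indices)
        while True:
            pos = self.indices[slot]
            if pos is None or self.entries[pos][0] == key:
                return slot
            slot = (slot + 1) % len(self.indices)

    def __setitem__(self, key, value):
        slot = self._find_slot(key)
        pos = self.indices[slot]
        if pos is not None:
            self.entries[pos] = (key, value)   # overwrite in place
            return
        # Always keep one slot empty so that probing terminates.
        if len(self.entries) + 1 >= len(self.indices):
            raise RuntimeError("toy table is full; resizing is omitted")
        self.indices[slot] = len(self.entries)
        self.entries.append((key, value))

    def __getitem__(self, key):
        pos = self.indices[self._find_slot(key)]
        if pos is None:
            raise KeyError(key)
        return self.entries[pos][1]

    def keys(self):
        # Iteration follows the dense list, i.e. insertion order.
        return [key for key, _ in self.entries]

d = CompactDict()
d[13] = 'first'     # hashes to slot 5, stored at entries[0]
d[2] = 'second'     # hashes to slot 2, stored at entries[1]
print(d.keys())     # [13, 2]
print(d.indices)    # [None, None, 1, None, None, 0, None, None]
```

Lookups still go through the hash table, but iteration reads the entries list, so the keys come back in the order they were inserted.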

Use OrderedDict on older Python implementations

Prior to 3.7, you can use collections.OrderedDict() to get the same effect.

Deeper dive

For those interested in knowing more about how it works, I have a 37 minute video that shows from first principles all of the techniques used to make modern Python dictionaries.

Prior to Python 3.6, dictionaries were not ordered (see this stackoverflow thread for more on that). If you are using Python 3.6 or lower (in CPython 3.6, order preservation is an implementation detail; it became a language feature in Python 3.7), you can use OrderedDict to get the behavior you want.

For example, you could change the beginning of your code snippet to the following:

from collections import OrderedDict
repetitions = OrderedDict()
...
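As a fuller sketch, the loop from the question could look like this with OrderedDict. The use of setdefault is a simplification that was not in the original code; it is intended to build the same lists while replacing the if/else branches.

```python
from collections import OrderedDict

new_deltas = [[1.452, 3.292182683944702], [1.449, 4.7438647747039795],
              [1.494, 6.192960977554321], [1.429, 7.686920166015625]]

repetitions = OrderedDict()
for key, value in new_deltas:
    if float(value) <= 30:
        # setdefault creates the list the first time a key is seen,
        # then the value is appended to that list.
        repetitions.setdefault(float(key), []).append(value)

print(list(repetitions))   # [1.452, 1.449, 1.494, 1.429]
```

The keys now come back in the order they were first inserted, regardless of the Python version.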
