Why is dict definition faster in Python 2.7 than in Python 3.x?

I have encountered a (not very unusual) situation in which I had to use either a map() or a list comprehension expression. Then I wondered which one is faster.

This StackOverflow answer provided me the solution, but then I started to test it myself. Basically the results were the same, but when switching to Python 3 I found an unexpected behavior that I got curious about, namely:

λ iulian-pc ~ → python --version
Python 2.7.6
λ iulian-pc ~ → python3 --version
Python 3.4.3

λ iulian-pc ~ → python -mtimeit '{}'                                                     
10000000 loops, best of 3: 0.0306 usec per loop
λ iulian-pc ~ → python3 -mtimeit '{}'                
10000000 loops, best of 3: 0.105 usec per loop

λ iulian-pc ~ → python -mtimeit 'dict()'
10000000 loops, best of 3: 0.103 usec per loop
λ iulian-pc ~ → python3 -mtimeit 'dict()'
10000000 loops, best of 3: 0.165 usec per loop
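The same comparison can be reproduced from inside a script with the timeit module (a sketch; the absolute numbers will vary by machine, but dict() should consistently come out slower than {} since it involves a global name lookup and a function call rather than a single opcode):

```python
import timeit

# Time the literal syntax vs. the dict() constructor call.
literal_time = timeit.timeit('{}', number=10_000_000)
constructor_time = timeit.timeit('dict()', number=10_000_000)

print('{{}}:    {:.4f} s total'.format(literal_time))
print('dict(): {:.4f} s total'.format(constructor_time))
```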

I had assumed that Python 3 is faster than Python 2, but it turned out in several posts ( 1 , 2 ) that it's not the case. Then I thought that maybe Python 3.5 would perform better at such a simple task, as they state in their README :

The language is mostly the same, but many details, especially how built-in objects like dictionaries and strings work, have changed considerably, and a lot of deprecated features have finally been removed.

But nope, it performed even worse:

λ iulian-pc ~ → python3 --version
Python 3.5.0

λ iulian-pc ~ → python3 -mtimeit '{}'       
10000000 loops, best of 3: 0.144 usec per loop
λ iulian-pc ~ → python3 -mtimeit 'dict()'
1000000 loops, best of 3: 0.217 usec per loop

I've tried to dive into the Python 3.5 source code for dict , but my knowledge of C is not sufficient to find the answer myself (or maybe I'm not even searching in the right place).

So, my question is:

What makes the newer version of Python slower than an older version on a relatively simple task such as a dict definition, when by common sense it should be the other way around? I'm aware that these differences are so small that in most cases they can be neglected. It was just an observation that made me curious: why did the time increase rather than at least stay the same?

Because nobody cares

The differences you are citing are on the order of tens or hundreds of nanoseconds. A slight difference in how the C compiler optimizes register use could easily cause such changes (as could any number of other C-level optimization differences). That, in turn, could be caused by any number of things, such as changes in the number and usage of local variables in the C implementation of Python (CPython), or even just switching C compilers.

The fact is, nobody is actively optimizing for these small differences, so nobody is going to be able to give you a specific answer. CPython is not designed to be fast in an absolute sense. It is designed to be scalable . So, for example, you can shove hundreds or thousands of items into a dictionary and it will continue to perform well. But the absolute speed of creating a dictionary simply isn't a primary concern of the Python implementors, at least when the differences are this small.
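One way to see how fragile nanosecond-level numbers are is to repeat the measurement several times within one process; the run-to-run spread is often on the same order as the Python 2 vs. Python 3 gap being discussed (a sketch with arbitrary repeat/number values):

```python
import timeit

# Repeat the micro-benchmark several times; the spread between runs
# shows how much noise is baked into numbers this small.
results = timeit.repeat('{}', repeat=5, number=1_000_000)
per_loop_ns = [r / 1_000_000 * 1e9 for r in results]

print('best:  {:.1f} ns/loop'.format(min(per_loop_ns)))
print('worst: {:.1f} ns/loop'.format(max(per_loop_ns)))
```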

As @Kevin already stated:

CPython is not designed to be fast in an absolute sense. It is designed to be scalable .

Try this instead:

$ python -mtimeit "dict([(2,3)]*10000000)"
10 loops, best of 3: 512 msec per loop
$
$ python3 -mtimeit "dict([(2,3)]*10000000)"
10 loops, best of 3: 502 msec per loop

And again:

$ python -mtimeit "dict([(2,3)]*100000000)"
10 loops, best of 3: 5.19 sec per loop
$
$ python3 -mtimeit "dict([(2,3)]*100000000)"
10 loops, best of 3: 5.07 sec per loop

That pretty much shows that you can't declare Python 3 the loser against Python 2 based on such an insignificant difference. From the look of things, Python 3 should scale better.

Let's disassemble {} :

>>> from dis import dis
>>> dis(lambda: {})
  1           0 BUILD_MAP                0
              3 RETURN_VALUE
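For comparison, disassembling dict() shows why the constructor is slower in both versions: it needs a global name lookup plus a function call, whereas {} compiles to a single BUILD_MAP. The exact opcode names vary between interpreter versions, but the shape is the same:

```python
from dis import dis

# The constructor form: a LOAD_GLOBAL for the name `dict`, then a call
# opcode, instead of the single BUILD_MAP emitted for the {} literal.
dis(lambda: dict())
```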

Python 2.7 implementation of BUILD_MAP :

TARGET(BUILD_MAP)
{
    x = _PyDict_NewPresized((Py_ssize_t)oparg);
    PUSH(x);
    if (x != NULL) DISPATCH();
    break;
}

Python 3.5 implementation of BUILD_MAP :

TARGET(BUILD_MAP) {
    int i;
    PyObject *map = _PyDict_NewPresized((Py_ssize_t)oparg);
    if (map == NULL)
        goto error;
    for (i = oparg; i > 0; i--) {
        int err;
        PyObject *key = PEEK(2*i);
        PyObject *value = PEEK(2*i - 1);
        err = PyDict_SetItem(map, key, value);
        if (err != 0) {
            Py_DECREF(map);
            goto error;
        }
    }

    while (oparg--) {
        Py_DECREF(POP());
        Py_DECREF(POP());
    }
    PUSH(map);
    DISPATCH();
}

That's a little bit more code.

EDIT:

The Python 3.4 implementation of BUILD_MAP is exactly the same as 2.7's (thanks @user2357112). I dug deeper, and it looks like the minimum size of a dict in Python 3 is 8, the PyDict_MINSIZE_COMBINED constant:

PyDict_MINSIZE_COMBINED is the starting size for any new, non-split dict. 8 allows dicts with no more than 5 active entries; experiments suggested this suffices for the majority of dicts (consisting mostly of usually-small dicts created to pass keyword arguments). Making this 8, rather than 4, reduces the number of resizes for most dictionaries, without any significant extra memory use.

Look at _PyDict_NewPresized in Python 3.4:

PyObject *
_PyDict_NewPresized(Py_ssize_t minused)
{
    Py_ssize_t newsize;
    PyDictKeysObject *new_keys;
    for (newsize = PyDict_MINSIZE_COMBINED;
         newsize <= minused && newsize > 0;
         newsize <<= 1)
        ;
    new_keys = new_keys_object(newsize);
    if (new_keys == NULL)
        return NULL;
    return new_dict(new_keys, NULL);
}
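The sizing loop in the 3.4 version can be sketched in Python: starting from PyDict_MINSIZE_COMBINED (8), the table size doubles until it exceeds minused, so even minused = 1 gets an 8-slot table:

```python
PYDICT_MINSIZE_COMBINED = 8  # starting size for any new, non-split dict

def presized_table_slots(minused):
    """Mirror of the newsize loop in Python 3.4's _PyDict_NewPresized."""
    newsize = PYDICT_MINSIZE_COMBINED
    while newsize <= minused and newsize > 0:
        newsize <<= 1
    return newsize

print(presized_table_slots(1))   # -> 8
print(presized_table_slots(20))  # -> 32
```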

and in 2.7:

PyObject *
_PyDict_NewPresized(Py_ssize_t minused)
{
    PyObject *op = PyDict_New();

    if (minused>5 && op != NULL && dictresize((PyDictObject *)op, minused) == -1) {
        Py_DECREF(op);
        return NULL;
    }
    return op;
}

In both cases, minused has the value 1.

So Python 2.7 creates an empty dict, while Python 3.4 allocates an 8-slot table up front.
