Why random.randint() is much slower than random.getrandbits()

I made a test which compares random.randint() and random.getrandbits() in Python. The result shows that getrandbits is way faster than randint.

from random import randint, getrandbits, seed
from datetime import datetime

def ts():
    return datetime.now().timestamp()

def diff(st, en):
    print(f'{round((en - st) * 1000, 3)} ms')

seed(111)
N = 6

rn = []
st = ts()
for _ in range(10**N):
    rn.append(randint(0, 1023))
en = ts()
diff(st,en)

rn = []
st = ts()
for _ in range(10**N):
    rn.append(getrandbits(10))
en = ts()
diff(st,en)

The range of numbers is exactly the same, because 10 random bits range from 0 (in the case of 0000000000) to 1023 (in the case of 1111111111).

The output of the code:

590.509 ms
91.01 ms

As you can see, getrandbits under the same conditions is almost 6.5 times faster. But why? Can somebody explain that?

randint uses randrange, which uses _randbelow, which is _randbelow_with_getrandbits (if it is present), which uses getrandbits. However, randrange adds overhead on top of the underlying call to getrandbits because it must check the start and step arguments. Even though there are fast paths for these checks, they are still there. There are definitely other things contributing to the 6.5x slowdown when using randint, but this answer at least shows you that randint will always be slower than getrandbits.
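To see how much each layer of that call chain costs, a rough measurement along the following lines may help. It is only a sketch: time_call is a hypothetical helper written for this illustration, _randbelow is a private CPython detail that may change between versions, and the absolute numbers will differ from machine to machine.

```python
import random
import timeit

rng = random.Random(111)  # private generator, so the global state is untouched


def time_call(func, *args, number=100_000):
    """Return the mean time per call in nanoseconds (hypothetical helper)."""
    total = timeit.timeit(lambda: func(*args), number=number)
    return total / number * 1e9


# Each entry goes one layer deeper into the chain described above.
results = {
    "randint(0, 1023)": time_call(rng.randint, 0, 1023),
    "randrange(1024)": time_call(rng.randrange, 1024),
    "_randbelow(1024)": time_call(rng._randbelow, 1024),  # private, CPython-specific
    "getrandbits(10)": time_call(rng.getrandbits, 10),
}

for label, ns in results.items():
    print(f"{label:<18} {ns:8.1f} ns per call")
```

The lambda wrapper adds the same small cost to every row, so the differences between rows are more meaningful than the absolute values.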

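getrandbits() is a very short function that goes to native code almost immediately. The version below is the one defined in random.py, which belongs to SystemRandom; the module-level getrandbits used in the question comes from the C-implemented _random module, so it runs no Python-level code at all: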

def getrandbits(self, k):
    """getrandbits(k) -> x.  Generates an int with k random bits."""
    if k < 0:
        raise ValueError('number of bits must be non-negative')
    numbytes = (k + 7) // 8                       # bits / 8 and rounded up
    x = int.from_bytes(_urandom(numbytes), 'big')
    return x >> (numbytes * 8 - k)                # trim excess bits

_urandom is os.urandom, which you can't even find in os.py in the same folder, as it's native code. See details here: Where can I find the source code of os.urandom()?

randint() is a return self.randrange(a, b+1) call, where randrange() is a long and versatile function of almost 100 lines of Python code. Its "fast track" ends in a return istart + self._randbelow(width) after circa 50 lines of checking and initializing.
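A quick way to convince yourself that randint() is only a thin wrapper is to seed two generators identically and compare the two calls. This is a minimal sketch; the seed 111 is borrowed from the question and the sample size of 1000 is arbitrary.

```python
import random

# Two generators seeded identically produce the same stream of raw bits.
a = random.Random(111)
b = random.Random(111)

via_randint = [a.randint(0, 1023) for _ in range(1000)]
via_randrange = [b.randrange(0, 1024) for _ in range(1000)]

# The lists match because randint(a, b) simply forwards to randrange(a, b + 1).
print(via_randint == via_randrange)
```

Since the results are identical, any timing difference between the two comes from the extra Python-level function call alone.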

_randbelow() is actually a choice, depending on the existence of getrandbits(), which exists in your case, so probably this one is running:

def _randbelow_with_getrandbits(self, n):
    "Return a random int in the range [0,n).  Returns 0 if n==0."

    if not n:
        return 0
    getrandbits = self.getrandbits
    k = n.bit_length()  # don't use (n-1) here because n can be 1
    r = getrandbits(k)  # 0 <= r < 2**k
    while r >= n:
        r = getrandbits(k)
    return r

So randint() ends in the same getrandbits() after doing a lot of other things, and that's pretty much why it is slower.
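One more detail applies to the exact range in the question. randint(0, 1023) asks _randbelow for a value below 1024, and 1024 needs 11 bits, so the loop above draws 11 bits and rejects about half of the results. The sketch below counts the draws; randbelow_counting is a hypothetical stand-alone copy of the loop written for this illustration, not part of the random module.

```python
import random


def randbelow_counting(rng, n):
    """Same rejection loop as above (for n > 0), also returning the draw count."""
    k = n.bit_length()
    draws = 1
    r = rng.getrandbits(k)
    while r >= n:  # value out of range: throw it away and draw again
        r = rng.getrandbits(k)
        draws += 1
    return r, draws


rng = random.Random(111)
n = 1024            # randint(0, 1023) ends up asking for a value below 1024
k = n.bit_length()  # 11 rather than 10, because 1024 == 0b10000000000
samples = 100_000

total_draws = 0
for _ in range(samples):
    value, draws = randbelow_counting(rng, n)
    total_draws += draws

avg_draws = total_draws / samples
print(f"k = {k}, average getrandbits calls per result: {avg_draws:.3f}")
```

With an acceptance probability of 1/2, the average should come out close to two getrandbits calls per random number, which adds to the overhead of the Python-level checks.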

The underlying reason is that the function getrandbits calls is written in C; look at the source code in the Python repository on GitHub.

You will find that getrandbits, as defined in random.py for SystemRandom, calls urandom from the os module under the hood, and if you look at the implementation of this function, you'll find that it is indeed written in C. The default generator's getrandbits is likewise C code, in the _random module.

static PyObject *
os_urandom_impl(PyObject *module, Py_ssize_t size)
/*[clinic end generated code: output=42c5cca9d18068e9 input=4067cdb1b6776c29]*/
{
    PyObject *bytes;
    int result;

    if (size < 0)
        return PyErr_Format(PyExc_ValueError,
                            "negative argument not allowed");
    bytes = PyBytes_FromStringAndSize(NULL, size);
    if (bytes == NULL)
        return NULL;

    result = _PyOS_URandom(PyBytes_AS_STRING(bytes), PyBytes_GET_SIZE(bytes));
    if (result == -1) {
        Py_DECREF(bytes);
        return NULL;
    }
    return bytes;
}

Talking about randint, if you look at the source code for this function, it calls randrange, which further calls _randbelow_with_getrandbits.

    def randrange(self, start, stop=None, step=_ONE):
        """Choose a random item from range(start, stop[, step]).
        This fixes the problem with randint() which includes the
        endpoint; in Python this is usually not what you want.
        """

        # This code is a bit messy to make it fast for the
        # common case while still doing adequate error checking.
        try:
            istart = _index(start)
        except TypeError:
            istart = int(start)
            if istart != start:
                _warn('randrange() will raise TypeError in the future',
                      DeprecationWarning, 2)
                raise ValueError("non-integer arg 1 for randrange()")
            _warn('non-integer arguments to randrange() have been deprecated '
                  'since Python 3.10 and will be removed in a subsequent '
                  'version',
                  DeprecationWarning, 2)
        if stop is None:
            # We don't check for "step != 1" because it hasn't been
            # type checked and converted to an integer yet.
            if step is not _ONE:
                raise TypeError('Missing a non-None stop argument')
            if istart > 0:
                return self._randbelow(istart)
            raise ValueError("empty range for randrange()")

        # stop argument supplied.
        try:
            istop = _index(stop)
        except TypeError:
            istop = int(stop)
            if istop != stop:
                _warn('randrange() will raise TypeError in the future',
                      DeprecationWarning, 2)
                raise ValueError("non-integer stop for randrange()")
            _warn('non-integer arguments to randrange() have been deprecated '
                  'since Python 3.10 and will be removed in a subsequent '
                  'version',
                  DeprecationWarning, 2)
        width = istop - istart
        try:
            istep = _index(step)
        except TypeError:
            istep = int(step)
            if istep != step:
                _warn('randrange() will raise TypeError in the future',
                      DeprecationWarning, 2)
                raise ValueError("non-integer step for randrange()")
            _warn('non-integer arguments to randrange() have been deprecated '
                  'since Python 3.10 and will be removed in a subsequent '
                  'version',
                  DeprecationWarning, 2)
        # Fast path.
        if istep == 1:
            if width > 0:
                return istart + self._randbelow(width)
            raise ValueError("empty range for randrange() (%d, %d, %d)" % (istart, istop, width))

        # Non-unit step argument supplied.
        if istep > 0:
            n = (width + istep - 1) // istep
        elif istep < 0:
            n = (width + istep + 1) // istep
        else:
            raise ValueError("zero step for randrange()")
        if n <= 0:
            raise ValueError("empty range for randrange()")
        return istart + istep * self._randbelow(n)

    def randint(self, a, b):
        """Return random integer in range [a, b], including both end points.
        """

        return self.randrange(a, b+1)

As you can see, there is a lot going on in the try/except blocks of the randrange function, and since they are written in Python, all this overhead and control flow makes it slow.
