Why is random.randint() much slower than random.getrandbits()?
I made a test that compares random.randint() and random.getrandbits() in Python. The result shows that getrandbits is way faster than randint.
from random import randint, getrandbits, seed
from datetime import datetime

def ts():
    return datetime.now().timestamp()

def diff(st, en):
    print(f'{round((en - st) * 1000, 3)} ms')

seed(111)
N = 6

rn = []
st = ts()
for _ in range(10**N):
    rn.append(randint(0, 1023))
en = ts()
diff(st, en)

rn = []
st = ts()
for _ in range(10**N):
    rn.append(getrandbits(10))
en = ts()
diff(st, en)
The range of numbers is exactly the same, because 10 random bits range from 0 (in the case of 0000000000) to 1023 (in the case of 1111111111).
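The range claim can be checked directly. This is only a minimal sketch; the seed and the sample size are arbitrary choices, not part of the original test.

```python
from random import getrandbits, seed

seed(111)  # arbitrary seed, only for repeatability

# Ten bits, from all zeros to all ones.
lowest = int('0000000000', 2)
highest = int('1111111111', 2)

# Draw a sample and check that every value stays inside [lowest, highest].
samples = [getrandbits(10) for _ in range(100_000)]
in_range = all(lowest <= s <= highest for s in samples)

print(lowest, highest, in_range)
```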
The output of the code:
590.509 ms
91.01 ms
As you can see, getrandbits under the same conditions is almost 6.5 times faster. But why? Can somebody explain that?
randint uses randrange, which uses _randbelow, which is _randbelow_with_getrandbits (if it is present), which uses getrandbits. However, randrange adds overhead on top of the underlying call to getrandbits because it must check the start and step arguments. Even though there are fast paths for these checks, they are still there. There are definitely other things contributing to the 6.5x slowdown when using randint, but this answer at least shows you that randint will always be slower than getrandbits.
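To get a feel for how much each layer of that call chain costs, one option is to time the layers separately with timeit. The sketch below is an illustration, not a rigorous benchmark: it assumes CPython, it calls the private helper _randbelow (which may change between versions), and the seed and call count are arbitrary. Each lambda adds the same small constant, so only the differences between the rows are meaningful.

```python
import random
import timeit

rng = random.Random(111)  # private generator with an arbitrary seed

# Each entry sits one layer further away from getrandbits().
# _randbelow is a private CPython helper and may change between versions.
layers = {
    'getrandbits(10)': lambda: rng.getrandbits(10),
    '_randbelow(1024)': lambda: rng._randbelow(1024),
    'randrange(0, 1024)': lambda: rng.randrange(0, 1024),
    'randint(0, 1023)': lambda: rng.randint(0, 1023),
}

def time_layers(number=200_000):
    """Return the total seconds spent on `number` calls of each layer."""
    return {name: timeit.timeit(fn, number=number)
            for name, fn in layers.items()}

if __name__ == '__main__':
    for name, seconds in time_layers().items():
        print(f'{name:<20} {seconds * 1000:8.1f} ms')
```

The absolute numbers depend on the machine and the Python version, so no expected output is given here.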
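getrandbits() is a very short function going to native code almost immediately. The pure-Python version below is the one defined on SystemRandom; on the default generator behind the module-level functions, getrandbits() is itself implemented in C: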
def getrandbits(self, k):
    """getrandbits(k) -> x.  Generates an int with k random bits."""
    if k < 0:
        raise ValueError('number of bits must be non-negative')
    numbytes = (k + 7) // 8                       # bits / 8 and rounded up
    x = int.from_bytes(_urandom(numbytes), 'big')
    return x >> (numbytes * 8 - k)                # trim excess bits
_urandom is os.urandom, which you can't even find in os.py in the same folder, as it's native code. See details here: Where can I find the source code of os.urandom()?
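One way to see which of these functions are native is inspect.isbuiltin(), which is true for functions and bound methods implemented in C. The sketch assumes CPython; other interpreters may report different results.

```python
import inspect
import os
import random

# inspect.isbuiltin() is True for functions and bound methods implemented in C.
checks = {
    'os.urandom': inspect.isbuiltin(os.urandom),
    'random.getrandbits': inspect.isbuiltin(random.getrandbits),
    'random.randint': inspect.isbuiltin(random.randint),
    'random.randrange': inspect.isbuiltin(random.randrange),
}

for name, is_native in checks.items():
    print(f'{name:<20} native={is_native}')
```

On CPython the first two report native=True and the last two native=False, which matches the picture of randint and randrange being Python wrappers around a native core.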
randint() is a return self.randrange(a, b+1) call, where randrange() is a long and versatile function of almost 100 lines of Python code; its "fast track" ends in a return istart + self._randbelow(width) after circa 50 lines of checking and initializing.
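Since randint(a, b) is just randrange(a, b+1), two generators started from the same seed should produce identical sequences through either call. A small sketch to confirm this (the seed and sample count are arbitrary):

```python
import random

a, b = 0, 1023

# Two generators with the same (arbitrary) seed produce the same raw bits.
rng_int = random.Random(111)
rng_range = random.Random(111)

from_randint = [rng_int.randint(a, b) for _ in range(1000)]
from_randrange = [rng_range.randrange(a, b + 1) for _ in range(1000)]

print(from_randint == from_randrange)
```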
_randbelow() is actually a choice that depends on the existence of getrandbits(), which exists in your case, so this is probably the one that is running:
def _randbelow_with_getrandbits(self, n):
    "Return a random int in the range [0,n).  Returns 0 if n==0."
    if not n:
        return 0
    getrandbits = self.getrandbits
    k = n.bit_length()  # don't use (n-1) here because n can be 1
    r = getrandbits(k)  # 0 <= r < 2**k
    while r >= n:
        r = getrandbits(k)
    return r
So randint() ends in the same getrandbits() after doing a lot of other things, and that's pretty much why it is slower.
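One contributor that is easy to miss in the loop shown above: for randint(0, 1023) the width passed down is 1024, and (1024).bit_length() is 11, not 10. As far as I can tell from that code, the loop therefore asks for 11 bits and throws away every result of 1024 or more, which is about half of them. The sketch below mirrors the loop in a hypothetical helper, randbelow_counting (not part of the random module), to count how many getrandbits() calls each result needs; the seed and trial count are arbitrary.

```python
import random

def randbelow_counting(rng, n):
    """Mirror of the rejection loop that also reports how many draws it needed."""
    k = n.bit_length()
    draws = 1
    r = rng.getrandbits(k)
    while r >= n:  # reject values outside [0, n)
        r = rng.getrandbits(k)
        draws += 1
    return r, draws

rng = random.Random(111)  # arbitrary seed
n = 1024                  # width used for randint(0, 1023)
trials = 100_000

total_draws = sum(randbelow_counting(rng, n)[1] for _ in range(trials))
print(f'bits per draw: {n.bit_length()}')
print(f'average getrandbits() calls per result: {total_draws / trials:.2f}')
```

With a power-of-two width like this one, the average comes out close to two calls per result, so part of the gap in the question is extra getrandbits() work and not only Python-level checking.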
The underlying reason is that the function getrandbits calls is written in C; see the source code in the Python repository on GitHub.
You will find that getrandbits (in the SystemRandom class) calls urandom from the os module under the hood, and if you look at the implementation of that function, you'll find that it is indeed written in C.
static PyObject *
os_urandom_impl(PyObject *module, Py_ssize_t size)
/*[clinic end generated code: output=42c5cca9d18068e9 input=4067cdb1b6776c29]*/
{
    PyObject *bytes;
    int result;

    if (size < 0)
        return PyErr_Format(PyExc_ValueError,
                            "negative argument not allowed");
    bytes = PyBytes_FromStringAndSize(NULL, size);
    if (bytes == NULL)
        return NULL;

    result = _PyOS_URandom(PyBytes_AS_STRING(bytes), PyBytes_GET_SIZE(bytes));
    if (result == -1) {
        Py_DECREF(bytes);
        return NULL;
    }
    return bytes;
}
As for randint, if you look at the source code for this function, it calls randrange, which in turn calls _randbelow_with_getrandbits.
def randrange(self, start, stop=None, step=_ONE):
    """Choose a random item from range(start, stop[, step]).

    This fixes the problem with randint() which includes the
    endpoint; in Python this is usually not what you want.

    """

    # This code is a bit messy to make it fast for the
    # common case while still doing adequate error checking.
    try:
        istart = _index(start)
    except TypeError:
        istart = int(start)
        if istart != start:
            _warn('randrange() will raise TypeError in the future',
                  DeprecationWarning, 2)
            raise ValueError("non-integer arg 1 for randrange()")
        _warn('non-integer arguments to randrange() have been deprecated '
              'since Python 3.10 and will be removed in a subsequent '
              'version',
              DeprecationWarning, 2)
    if stop is None:
        # We don't check for "step != 1" because it hasn't been
        # type checked and converted to an integer yet.
        if step is not _ONE:
            raise TypeError('Missing a non-None stop argument')
        if istart > 0:
            return self._randbelow(istart)
        raise ValueError("empty range for randrange()")

    # stop argument supplied.
    try:
        istop = _index(stop)
    except TypeError:
        istop = int(stop)
        if istop != stop:
            _warn('randrange() will raise TypeError in the future',
                  DeprecationWarning, 2)
            raise ValueError("non-integer stop for randrange()")
        _warn('non-integer arguments to randrange() have been deprecated '
              'since Python 3.10 and will be removed in a subsequent '
              'version',
              DeprecationWarning, 2)
    width = istop - istart
    try:
        istep = _index(step)
    except TypeError:
        istep = int(step)
        if istep != step:
            _warn('randrange() will raise TypeError in the future',
                  DeprecationWarning, 2)
            raise ValueError("non-integer step for randrange()")
        _warn('non-integer arguments to randrange() have been deprecated '
              'since Python 3.10 and will be removed in a subsequent '
              'version',
              DeprecationWarning, 2)
    # Fast path.
    if istep == 1:
        if width > 0:
            return istart + self._randbelow(width)
        raise ValueError("empty range for randrange() (%d, %d, %d)" % (istart, istop, width))

    # Non-unit step argument supplied.
    if istep > 0:
        n = (width + istep - 1) // istep
    elif istep < 0:
        n = (width + istep + 1) // istep
    else:
        raise ValueError("zero step for randrange()")
    if n <= 0:
        raise ValueError("empty range for randrange()")
    return istart + istep * self._randbelow(n)

def randint(self, a, b):
    """Return random integer in range [a, b], including both end points.
    """

    return self.randrange(a, b+1)
As you can see, there is a lot going on in the try/except blocks of the randrange function, and because all of it is written in Python, this overhead and control flow make it slow.