Sampling integers uniformly and efficiently in Python using numpy/scipy
I have a problem where, depending on the result of a random coin flip, I have to sample a random starting position from a string. If the sampling of this random position is uniform over the string, I thought of two approaches to do it: one using multinomial from numpy.random, the other using the simple randint function of the Python standard library. I tested this as follows:
from numpy import *
from numpy.random import multinomial
from random import randint
import time

def use_multinomial(length, num_points):
    probs = ones(length)/float(length)
    for n in range(num_points):
        result = multinomial(1, probs)

def use_rand(length, num_points):
    for n in range(num_points):
        rand(1, length)

def main():
    length = 1700
    num_points = 50000
    t1 = time.time()
    use_multinomial(length, num_points)
    t2 = time.time()
    print("Multinomial took: %s seconds" % (t2 - t1))
    t1 = time.time()
    use_rand(length, num_points)
    t2 = time.time()
    print("Rand took: %s seconds" % (t2 - t1))

if __name__ == '__main__':
    main()
The output is:
Multinomial took: 6.58072400093 seconds
Rand took: 2.35189199448 seconds
It seems like randint is faster, but it still seems very slow to me. Is there a vectorized way to make this much faster, using numpy or scipy?
Thanks.
I changed your code to actually return values (and used randint instead of rand; isn't that what you meant?), like this:
def use_multinomial(length, num_points):
    probs = ones(length)/float(length)
    return multinomial(1, probs, num_points)

def use_rand(length, num_points):
    return [randint(1,length) for _ in range(num_points)]
Then I tried my own version, using numpy.random.randint to generate a numpy array of random points on the string:
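# Assumed import: the nprandint alias is not shown in the original post.
from numpy.random import randint as nprandint

def use_np_randint(length, num_points):
    return nprandint(1, length, num_points)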
The results:
Multinomial took: 13.6279997826 seconds
Rand took: 0.185000181198 seconds
NP randint took: 0.00100016593933 seconds
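For context on the multinomial number, here is a minimal sketch, assuming only that numpy is installed, of what multinomial(1, probs, num_points) hands back. Each draw is a one-hot count vector of size length, so the call fills a num_points by length array, and the position still has to be recovered with argmax. The small sizes below are illustrative, not the ones from the timing run.

```python
import numpy as np

length = 10       # illustrative string length (the timing above used 1700)
num_points = 5    # illustrative number of draws (the timing above used 50000)

# Uniform probabilities over the positions of the string.
probs = np.ones(length) / float(length)

# One row per draw; each row is a one-hot vector with a single 1
# marking the sampled position.
draws = np.random.multinomial(1, probs, num_points)

# Recover the 0-based position of the 1 in each row.
positions = draws.argmax(axis=1)
```

With length = 1700 and num_points = 50000 that array has 85 million entries, which probably accounts for a good part of the time, although I have not profiled it.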
Multinomial is obviously really slow comparatively, but is that even what you want? I thought you said you wanted a uniform distribution? Using numpy's randint is clearly the fastest of the bunch.
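Putting this together for the original problem, here is a rough sketch. It assumes the coin is fair and that start positions are 0-based string indices, neither of which is stated in the question. The names sample_starts and p_heads, and the use of -1 as a "no sample" marker, are made up for illustration. Note that numpy.random.randint excludes the upper bound, unlike random.randint, which includes it.

```python
import numpy as np

def sample_starts(length, num_points, p_heads=0.5):
    """Flip num_points coins; where heads, draw a uniform start in [0, length).

    Hypothetical helper: entries where the coin came up tails are set to -1.
    """
    # Vectorized coin flips: True means heads.
    heads = np.random.random_sample(num_points) < p_heads
    # Uniform 0-based start positions; the upper bound is exclusive.
    starts = np.random.randint(0, length, size=num_points)
    # Keep the start only where the coin came up heads.
    return np.where(heads, starts, -1)

result = sample_starts(1700, 50000)
```

Drawing a start for every coin and masking afterwards wastes some random numbers, but it keeps everything in a couple of array operations instead of a Python loop.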