
Time performance of np.random.permutation, np.random.choice

I encountered a function in my pure Python graph theory library with very poor time performance relative to comparable MATLAB code, so I profiled some of the operations in that function.

I tracked it down to the following result:

In [27]: timeit.timeit( 'permutation(138)[:4]', setup='from numpy.random import permutation', number=1000000)
Out[27]: 27.659916877746582

Compare this to the performance in MATLAB:

>> tic; for i=1:1000000; randperm(138,4); end; toc
Elapsed time is 4.593305 seconds.

I was able to improve performance considerably by using np.random.choice instead of np.random.permutation, which I had originally written.

In [42]: timeit.timeit( 'choice(138, 4)', setup='from numpy.random import choice', number=1000000)
Out[42]: 18.9618501663208

But it still doesn't come close to the MATLAB performance.

Is there another way of obtaining this behavior in pure Python with time performance approaching that of MATLAB?

Based on this solution, which showed how to simulate the behavior of np.random.choice(..., replace=False) with a trick based on argsort / argpartition, you can recreate MATLAB's randperm(138,4), i.e. NumPy's np.random.choice(138, 4, replace=False), with np.argpartition as:

np.random.rand(138).argpartition(range(4))[:4]

Or with np.argsort, like so:

np.random.rand(138).argsort()[:4]
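The trick works because ranking 138 independent uniform keys gives a random ordering of the indices, so the first four positions are four distinct indices. A minimal sketch of both variants follows. The function names are hypothetical and chosen only for illustration. Note that np.random.choice samples with replacement by default, so replace=False is needed to match randperm(138,4).

```python
import numpy as np

def sample_argpartition(n, k):
    # Hypothetical helper: draw n uniform keys, then take the indices of the
    # k smallest keys. Passing all positions 0..k-1 as kth places each of
    # the first k entries in its sorted position.
    return np.random.rand(n).argpartition(np.arange(k))[:k]

def sample_argsort(n, k):
    # Hypothetical helper: fully sort the keys and keep the first k indices.
    return np.random.rand(n).argsort()[:k]

if __name__ == "__main__":
    print(sample_argpartition(138, 4))
    print(sample_argsort(138, 4))
    # Built-in equivalent, sampling without replacement:
    print(np.random.choice(138, 4, replace=False))
```

Both helpers return a NumPy array of k distinct integers in the range 0 to n-1.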

Let's time these two versions for performance comparison against the MATLAB version.

On MATLAB:

>> tic; for i=1:1000000; randperm(138,4); end; toc
Elapsed time is 1.058177 seconds.

On NumPy with np.argpartition:

In [361]: timeit.timeit( 'np.random.rand(138).argpartition(range(4))[:4]', setup='import numpy as np', number=1000000)
Out[361]: 9.063489798831142

On NumPy with np.argsort:

In [362]: timeit.timeit( 'np.random.rand(138).argsort()[:4]', setup='import numpy as np', number=1000000)
Out[362]: 5.74625801707225

The originally proposed one with NumPy:

In [363]: timeit.timeit( 'choice(138, 4)', setup='from numpy.random import choice', number=1000000)
Out[363]: 6.793723535243771

Seems like one could use np.argsort for a marginal performance improvement (about 5.7 s versus 6.8 s for np.random.choice in this run).
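Timings like these depend heavily on the machine and the NumPy version, so it may be worth re-running them locally. The following is a rough sketch of a small harness for that. The candidates dictionary and the time_candidates name are assumptions made for this example, and the iteration count is kept small so that it finishes quickly.

```python
import timeit

# Statements to compare; each one yields 4 distinct indices below 138.
candidates = {
    "permutation": "np.random.permutation(138)[:4]",
    "choice (replace=False)": "np.random.choice(138, 4, replace=False)",
    "argpartition": "np.random.rand(138).argpartition(np.arange(4))[:4]",
    "argsort": "np.random.rand(138).argsort()[:4]",
}

def time_candidates(number=10_000, repeat=3):
    # Return the best observed time per call, in microseconds, per candidate.
    results = {}
    for name, stmt in candidates.items():
        times = timeit.repeat(stmt, setup="import numpy as np",
                              number=number, repeat=repeat)
        results[name] = min(times) / number * 1e6
    return results

if __name__ == "__main__":
    for name, usec in time_candidates().items():
        print(f"{name:>24}: {usec:.2f} us per call")
```

Taking the minimum over several repeats reduces noise from other processes; multiplying the per-call figure by 1,000,000 gives a number comparable to the totals quoted above.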

How long does this take for you? I estimate 1-2 seconds.

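import numpy as np

def four():
    # Draw one integer below 138**4 and read it as four base-138 digits.
    k = np.random.randint(138**4)
    a = k % 138
    b = k // 138 % 138
    c = k // 138**2 % 138
    d = k // 138**3 % 138
    # Retry when any two digits coincide, so the four values are distinct.
    return (a, b, c, d) if a != b and a != c and a != d and b != c and b != d and c != d else four()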

Update 1: At first I used random.randrange, but np.random.randint made the whole thing about twice as fast.

Update 2: Since NumPy's random function appears to be much faster, I tried this, and it's faster by another factor of about 1.33:

>>> def four():
        a = randint(138)
        b = randint(138)
        c = randint(138)
        d = randint(138)
        return (a, b, c, d) if a != b and a != c and a != d and b != c and b != d and c != d else four()

>>> import timeit
>>> from numpy.random import randint
>>> timeit.timeit(lambda: four(), number=1000000)
2.3742770821572776

That's about 22 times faster than the original:

>>> timeit.timeit('permutation(138)[:4]', setup='from numpy.random import permutation', number=1000000)
51.80568455893672

(Using a string versus a lambda doesn't make a noticeable difference.)
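The approach above is rejection sampling: draw four independent values and retry on any collision. With 138 possible values, a draw of four is collision-free with probability 138·137·136·135 / 138^4, roughly 0.957, so retries are rare. Here is a sketch of the same idea written as a loop instead of recursion. The names four_iterative and acceptance_probability are hypothetical, introduced only for this illustration.

```python
from numpy.random import randint

def four_iterative(n=138):
    # Loop-based variant of the rejection idea: redraw until all four
    # values differ. A set of four values has size 4 only if all are distinct.
    while True:
        a, b, c, d = randint(n), randint(n), randint(n), randint(n)
        if len({a, b, c, d}) == 4:
            return (a, b, c, d)

def acceptance_probability(n=138, k=4):
    # Probability that k independent uniform draws from n values are distinct.
    p = 1.0
    for i in range(k):
        p *= (n - i) / n
    return p

if __name__ == "__main__":
    print(four_iterative())
    print(acceptance_probability())
```

The acceptance probability drops quickly as k grows relative to n, so this trick suits small samples from a comparatively large range, as in the 4-of-138 case here.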
