Speeding up multinomial random sample in Python/NumPy

I'm generating a vector of draws from a multinomial distribution over a set of probabilities probs, where each draw is the index of the entry in probs that was chosen:

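import numpy as np
def sample_mult(K, probs):
    # K is the number of draws; probs is the vector of probabilities
    result = np.zeros(K, dtype=np.int32)
    for n in range(K):
        draws = np.random.multinomial(1, probs)  # one-hot vector for a single draw
        result[n] = np.where(draws == 1)[0][0]   # index of the chosen entry
    return result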

Can this be sped up? It seems inefficient to call np.random.multinomial over and over again (and np.where might also be slow.)

timeit says: The slowest run took 6.72 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 18.9 µs per loop

You can pass the size argument to np.random.multinomial to get K rows of random samples at once, instead of the single sample returned with the default size=None, and then use .argmax(1) to reproduce the np.where()[0][0] behaviour on every row.

Thus, we would have a vectorized solution, like so -

result = (np.random.multinomial(1,probs,size=K)==1).argmax(1)
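To see why the one-liner above works, here is a minimal sketch on made-up inputs. The values of probs and K and the seed are assumptions chosen only for illustration. With n=1, each row returned by np.random.multinomial contains a single 1, so argmax(1) returns the position of that 1, which is the chosen index.

```python
import numpy as np

np.random.seed(0)          # seed picked only to make the illustration repeatable
probs = [0.2, 0.5, 0.3]    # hypothetical probabilities, not from the question
K = 5                      # hypothetical number of draws

# One row per draw; each row is a one-hot vector of length len(probs).
draws = np.random.multinomial(1, probs, size=K)

# argmax along axis 1 gives the column that holds the single 1 in each row.
result = draws.argmax(1)

print(draws.shape)   # (K, len(probs))
print(result)        # K indices into probs
```

Since every row is already 0/1 when n=1, the ==1 comparison in the one-liner does not change the result here; calling argmax on draws directly gives the same indices.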

The p= parameter of np.random.choice does this directly (and avoids the argmax):

result = np.random.choice(len(probs), K, p=probs)
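As a rough sanity check on the np.random.choice approach, the sketch below draws many samples and compares the empirical frequencies with probs. It uses NumPy's newer Generator interface (np.random.default_rng), which also provides a choice method with a p argument; the probabilities, sample size, and seed are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded generator, for repeatability only
probs = np.array([0.2, 0.5, 0.3])   # hypothetical probabilities
K = 100_000                         # hypothetical number of draws

# Draw K indices in one call; index i is picked with probability probs[i].
result = rng.choice(len(probs), size=K, p=probs)

# Empirical frequency of each index; should be close to probs for large K.
freqs = np.bincount(result, minlength=len(probs)) / K
print(freqs)
```

The result is a 1-D integer array of length K, the same shape that sample_mult returns, without building the intermediate K-by-len(probs) one-hot matrix.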
