Speeding up multinomial random sample in Python/NumPy

Question

I'm generating a vector of draws from a multinomial distribution over a set of probabilities probs, where each draw is the index of the entry in probs that was chosen:

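import numpy as np

def sample_mult(K, probs):
    # Draw K indices into probs, one multinomial trial per iteration.
    result = np.zeros(K, dtype=np.int32)
    for n in range(K):
        draws = np.random.multinomial(1, probs)
        # draws is one-hot; take the index of its single 1.
        result[n] = np.where(draws == 1)[0][0]
    return result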

Can this be sped up? It seems inefficient to call np.random.multinomial once per draw, and np.where might also be slow.

timeit says:

The slowest run took 6.72 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 18.9 µs per loop

Answer 1

You can pass the size argument to np.random.multinomial to get K rows of one-hot samples from a single call, instead of the single row it returns by default, and then use .argmax(1) to reproduce the np.where()[0][0] behaviour.

Thus, we would have a vectorized solution, like so -

result = (np.random.multinomial(1,probs,size=K)==1).argmax(1)
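To make the one-liner above easier to follow, here is a minimal sketch that wraps it in a function. The name sample_mult_vectorized and the example probabilities are my own, for illustration only. Since every row is one-hot, argmax along axis 1 already returns the position of the single 1, so the ==1 comparison is optional.

```python
import numpy as np

def sample_mult_vectorized(K, probs):
    # A single call returns K one-hot rows, shape (K, len(probs)).
    draws = np.random.multinomial(1, probs, size=K)
    # Each row holds exactly one 1; argmax along axis 1 gives its column index.
    return draws.argmax(axis=1)

probs = [0.2, 0.5, 0.3]  # illustrative values, not from the question
result = sample_mult_vectorized(1000, probs)
```

result is then a length-K integer array of indices into probs, the same kind of output as the loop in the question.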

Answer 2

The p= parameter of np.random.choice does this directly (and avoids the argmax):

result = np.random.choice(len(probs), K, p=probs)
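As a rough sanity check on the choice approach, the sketch below draws a large sample and compares the empirical frequencies with probs. It assumes NumPy 1.17 or later, because it uses the np.random.default_rng Generator interface rather than the legacy np.random.choice call shown above. The probabilities, the seed, and K are illustrative choices of mine.

```python
import numpy as np

probs = np.array([0.2, 0.5, 0.3])  # illustrative values, not from the question
K = 100_000

rng = np.random.default_rng(0)  # seeded only to make the run repeatable
# Draw K indices from range(len(probs)), weighted by probs.
result = rng.choice(len(probs), size=K, p=probs)

# Fraction of draws that landed on each index; should be close to probs.
freqs = np.bincount(result, minlength=len(probs)) / K
```

With this many draws, freqs should agree with probs to about two decimal places.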

2 answers

Answer 1: 6, ACCEPTED, 2016-02-01 15:10:12

Answer 2: 1, 2019-11-12 09:08:40
