I'm generating a vector of draws from a multinomial distribution over a set of probabilities probs, where each draw is the index of the entry in probs that was chosen:
import numpy as np

def sample_mult(K, probs):
    # One slot per draw; K is the number of draws.
    result = np.zeros(K, dtype=np.int32)
    for n in range(K):
        # Each call returns a one-hot vector marking the chosen entry.
        draws = np.random.multinomial(1, probs)
        result[n] = np.where(draws == 1)[0][0]
    return result
Can this be sped up? It seems inefficient to call np.random.multinomial over and over again (and np.where might also be slow).
timeit says: The slowest run took 6.72 times longer than the fastest. This could mean that an intermediate result is being cached. 100000 loops, best of 3: 18.9 µs per loop
You can use the size option of np.random.multinomial to get one row of random samples per draw, instead of the single sample returned by default, and then use .argmax(1) to simulate the np.where()[0][0] behaviour.
Thus, we would have a vectorized solution, like so -
result = (np.random.multinomial(1,probs,size=K)==1).argmax(1)
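As a rough sketch of how that one-liner could be wrapped up as a drop-in replacement for the loop, the function name sample_mult_vectorized and the example probs below are made up for illustration. The == 1 comparison is left out here because each row is already one-hot, so argmax finds the single 1 either way.

```python
import numpy as np

def sample_mult_vectorized(K, probs):
    # One call draws K one-hot rows; the array has shape (K, len(probs)).
    one_hot = np.random.multinomial(1, probs, size=K)
    # argmax along axis 1 gives the column index of the single 1 in each row.
    return one_hot.argmax(axis=1)

# Illustrative probabilities (an assumption, not from the question).
probs = [0.2, 0.5, 0.3]
result = sample_mult_vectorized(1000, probs)
```

The result is a 1-D array of K indices into probs, the same thing the original loop builds one element at a time.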
You can also do this with the p= parameter of np.random.choice (and avoid the argmax):
result = np.random.choice(len(probs), K, p=probs)
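A quick way to convince yourself that the np.random.choice approach samples from the right distribution is to compare empirical frequencies against probs. The sketch below assumes NumPy 1.17 or later, since it uses the newer np.random.default_rng generator rather than the legacy np.random.choice call shown above; the function name sample_choice, the seed, and the example probs are all illustrative.

```python
import numpy as np

def sample_choice(K, probs, seed=None):
    # default_rng returns a Generator; its choice method accepts p= just like
    # the legacy np.random.choice function.
    rng = np.random.default_rng(seed)
    return rng.choice(len(probs), size=K, p=probs)

# Illustrative probabilities (an assumption, not from the question).
probs = np.array([0.2, 0.5, 0.3])
draws = sample_choice(100_000, probs, seed=0)

# Fraction of draws that landed on each index; should be close to probs.
freqs = np.bincount(draws, minlength=len(probs)) / draws.size
```

With this many draws, freqs should land within a percent or so of probs.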