Generating random words using Python

Question

I have a list of words:

count=100    
list = ['apple','orange','mango']

For the count above, is it possible to use a random function to select apple 40% of the time, orange 30% of the time and mango 30% of the time?

For example:

For count=100: 40 times apple, 30 times orange and 30 times mango.

The selection has to happen randomly.

Answer 1

Based on an answer to the question about generating discrete random variables with specified weights, you can use numpy.random.choice to get code that is 20 times faster than with random.choice:

from numpy.random import choice

sample = choice(['apple','orange','mango'], p=[0.4, 0.3, 0.3], size=1000000)

from collections import Counter
print(Counter(sample))

Output:

Counter({'apple': 399778, 'orange': 300317, 'mango': 299905})

Not to mention that it is actually easier than "to build a list in the required proportions and then shuffle it".

Also, the shuffle approach always produces exactly 40% apples, 30% oranges and 30% mangoes, which is not the same as "produce a sample of a million fruits according to a discrete probability distribution". The latter is what both choice solutions do (and the bisect one too). As the output above shows, numpy gives about 40% apples, and so on.
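To make the difference concrete, here is a minimal sketch using only the standard library. It assumes Python 3.6 or later, where random.choices accepts a weights argument; the variable names are illustrative and not taken from either answer.

```python
import random
from collections import Counter

words = ['apple', 'orange', 'mango']
weights = [0.4, 0.3, 0.3]
count = 100

# Exact proportions: the counts are always 40/30/30, only the order is random.
exact = ['apple'] * 40 + ['orange'] * 30 + ['mango'] * 30
random.shuffle(exact)

# Independent draws: each pick follows the given probabilities,
# so the counts only come out close to 40/30/30.
sampled = random.choices(words, weights=weights, k=count)

exact_counts = Counter(exact)
sampled_counts = Counter(sampled)
print(exact_counts)
print(sampled_counts)
```

Running this several times should show the first counter staying fixed while the second one varies from run to run.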

Answer 2

The easiest way is to build a list in the required proportions and then shuffle it.

>>> import random
>>> result = ['apple'] * 40 + ['orange'] * 30 + ['mango'] * 30
>>> random.shuffle(result)
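The same idea can be wrapped in a function for other counts. This is only a sketch: exact_shuffle is a hypothetical helper name, and it assumes whole-number percentages that divide the count evenly, raising an error otherwise rather than guessing how to round.

```python
import random

def exact_shuffle(words, percents, count):
    """Return `count` words in random order, each appearing exactly its share.

    Hypothetical helper: `percents` are whole-number percentages summing to 100.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    result = []
    for word, pct in zip(words, percents):
        # Number of copies of this word; refuse counts that do not split evenly.
        copies, remainder = divmod(count * pct, 100)
        if remainder:
            raise ValueError("count does not split evenly for " + word)
        result.extend([word] * copies)
    random.shuffle(result)  # shuffles in place
    return result

result = exact_shuffle(['apple', 'orange', 'mango'], [40, 30, 30], 100)
print(result[:5])
```

Because the whole list is built in memory before shuffling, this suits moderate counts; for very large counts, drawing one item at a time may be preferable.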

Edit for the new requirement that the count is really 1,000,000:

>>> count = 1000000
>>> pool = ['apple'] * 4 + ['orange'] * 3 + ['mango'] * 3
>>> for i in range(count):
...     print(random.choice(pool))

A slower but more general alternative is to bisect a cumulative probability distribution:

>>> import bisect
>>> choices = ['apple', 'orange', 'mango']
>>> cum_prob_dist = [0.4, 0.7]
>>> for i in range(count):
...     print(choices[bisect.bisect(cum_prob_dist, random.random())])
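The cumulative cutoffs above are written by hand. As a sketch of how they could be derived from arbitrary weights, the following uses itertools.accumulate; make_picker is a hypothetical helper name, and the integer weights 4, 3, 3 are an assumption standing in for 40%, 30%, 30%.

```python
import bisect
import itertools
import random

def make_picker(choices, weights):
    """Return a function that picks one item with probability proportional to its weight."""
    cumulative = list(itertools.accumulate(weights))  # e.g. [4, 7, 10]
    total = cumulative[-1]
    # Drop the last running total so the bisect result is always a valid index.
    cutoffs = cumulative[:-1]                         # e.g. [4, 7]

    def pick():
        # Uniform draw in [0, total); bisect counts how many cutoffs are <= x.
        x = random.random() * total
        return choices[bisect.bisect(cutoffs, x)]

    return pick

pick_fruit = make_picker(['apple', 'orange', 'mango'], [4, 3, 3])
draws = [pick_fruit() for _ in range(1000)]
print(draws[:5])
```

Building the cutoffs once and reusing them keeps each draw to a single binary search, which is the main appeal of this approach when there are many choices.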

2 solutions

Solution 1
5, accepted, 2016-05-28 08:59:14

Solution 2
4, 2016-05-28 04:20:29
