Uniform distribution with random on iris
This is the code I have to randomly select 30 data points uniformly. The part that confuses me is why we are checking if random.random() <= p. Can anyone explain?
from sklearn import datasets
import random
iris = datasets.load_iris()
d = iris.data
# sample 30 points uniform randomly from 150 points dataset
n = 150
m = 30
p = m/n
lst = []
for i in range(0, n):
    if random.random() <= p:
        lst.append(d[i,:])
So p represents the probability of an element being selected. As there are 150 total elements, and 30 elements need selecting, the probability of selecting one element is 30/150 = 0.2. This is set to p.
Each element is then iterated over, and if the result of random.random() (a float that is at least 0 and less than 1) is less than or equal to p, then that element is selected (I assume this; I do not fully know your dataset). Because random.random() is uniform on that interval, the check succeeds with probability p, so each element is kept independently with probability 0.2.
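To see why the comparison works, here is a minimal sketch that repeats the check many times and measures how often it passes. The seed and the trial count are my own choices for illustration, not part of the question's code.

```python
import random

random.seed(0)  # fixed seed only so the run is repeatable (an assumption)

p = 30 / 150      # same probability as in the question
trials = 100_000  # arbitrary number of repetitions for the illustration

# Count how often a uniform draw from [0, 1) falls at or below p.
hits = sum(1 for _ in range(trials) if random.random() <= p)
fraction = hits / trials

print(fraction)  # should land close to p = 0.2
```

The fraction of passing checks approaches p as the number of trials grows, which is exactly what "select with probability p" means.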
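On average, this should give about 30 elements. The exact count varies from run to run, because each element is selected independently; a single run can return somewhat fewer or more than 30.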
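The following sketch illustrates that point: it reruns the per-element check many times and records how many indices get selected each time. It also shows random.sample from the standard library, which is a different technique from the one in the question and returns exactly m distinct indices. The helper name bernoulli_indices, the seed, and the number of runs are hypothetical choices for this example.

```python
import random

random.seed(1)  # fixed seed only so the run is repeatable (an assumption)

n, m = 150, 30
p = m / n

def bernoulli_indices(n, p):
    """Keep each index independently with probability p (the question's approach)."""
    return [i for i in range(n) if random.random() <= p]

# Repeat the selection to see how the sample size fluctuates around m.
counts = [len(bernoulli_indices(n, p)) for _ in range(2000)]
mean_count = sum(counts) / len(counts)
print(min(counts), max(counts), mean_count)

# Alternative: draw exactly m distinct indices without replacement.
exact = random.sample(range(n), m)
print(len(exact))
```

If exactly 30 rows are required, the indices in exact could be used to pick rows from d, for example with d[exact, :], assuming d is the NumPy array from the question.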