Iris: uniform random sampling

Question

This is the code I have to select 30 data points uniformly at random. The part that confuses me is why we check if random.random() <= p. Can anyone explain?

from sklearn import datasets
import random
iris = datasets.load_iris()
d = iris.data

# sample about 30 points uniformly at random from the 150-point dataset
n = 150
m = 30
p = m/n

lst = []
for i in range(0, n):
    if random.random() <= p:
        lst.append(d[i,:])

Answer 1

So p represents the probability of an element being selected.

As there are 150 elements in total and 30 of them need to be selected, the probability of selecting any one element is 30/150 = 0.2. This is the value assigned to p.

Each element is then iterated over, and if the result of random.random() (a float between 0 and 1) is less than or equal to p, then that element is selected. Because random.random() is uniform on that range, the condition is true with probability p, so every element has the same chance of being picked (I assume this; I do not fully know your dataset).
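To see why the comparison works, here is a minimal sketch that repeats the check many times and measures how often it is true. The helper name bernoulli_select, the seed, and the trial count are my own choices for illustration and are not part of the original code.

```python
import random

def bernoulli_select(p, rng):
    """Return True with probability (approximately) p.

    rng.random() is uniform on [0, 1), so the fraction of draws
    that fall at or below p is p itself.
    """
    return rng.random() <= p

rng = random.Random(0)   # fixed seed, only so the run is repeatable
p = 30 / 150             # same p as in the question: 0.2
trials = 100_000         # arbitrary number of repetitions

hits = sum(bernoulli_select(p, rng) for _ in range(trials))
fraction = hits / trials
print(fraction)          # should land close to 0.2
```

The printed fraction should sit near 0.2, which is the per-element selection probability the loop in the question relies on.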

On average, this should give about 30 elements. The exact count varies from run to run, because each element is selected independently of the others.
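As a rough illustration of "about 30", the sketch below repeats the selection loop on row indices only (so it does not need the dataset itself) and looks at how the count varies. It also shows random.sample from the standard library as one possible alternative if exactly 30 points are required. The function name bernoulli_indices, the seed, and the number of runs are assumptions made for this example.

```python
import random

n, m = 150, 30
p = m / n
rng = random.Random(42)  # fixed seed, only so the run is repeatable

def bernoulli_indices(n, p, rng):
    """Keep each index 0..n-1 independently with probability p."""
    return [i for i in range(n) if rng.random() <= p]

# Repeat the selection many times and record how many indices were kept.
counts = [len(bernoulli_indices(n, p, rng)) for _ in range(2000)]
mean_count = sum(counts) / len(counts)
print(min(counts), mean_count, max(counts))  # the count fluctuates around 30

# Alternative: draw exactly m distinct indices, uniformly without replacement.
exact = rng.sample(range(n), m)
print(len(exact))
```

The indices in exact could then be used to pick rows from d in the same way the original loop uses d[i,:]. Which approach fits depends on whether a fixed sample size matters for your use case.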

Answer 1 (3, accepted) 2019-01-07 20:30:03
