How to select data randomly and fairly in MATLAB
How can I randomly and fairly select some data from a dataset in MATLAB?
When we use the randperm function to select data, is the selection random and fair?
As you already suggested, selecting k rows chosen uniformly at random out of n can be done with randperm, assuming you don't want duplicates.
Example:
dataSet = rand(1000,4);
idx = randperm(size(dataSet,1),10)
dataSet(idx,:)
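If duplicates are acceptable (sampling with replacement), one possible sketch is to draw the row indices with randi instead of randperm. This variant is not part of the original answer, and the names k, idxRep and sampleRep are illustrative assumptions.

```matlab
dataSet = rand(1000,4);
k = 10;                                  % number of rows to draw (illustrative)
idxRep = randi(size(dataSet,1), 1, k);   % k indices in 1..1000, repeats possible
sampleRep = dataSet(idxRep,:);           % k-by-4 matrix of the selected rows
```

Each index is still drawn uniformly, but the same row may appear more than once in sampleRep.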
If you have the Statistics Toolbox, you can use randsample:
sample = randsample(data,k);
This takes k values sampled uniformly at random, without replacement, from the values in the vector data. See the randsample documentation for other options.
Equivalent code with randperm:
ind = randperm(numel(data));
sample = data(ind(1:k));
Yes, either of these approaches gives random samples, and yes, they are fair. I assume that by "fair" you mean "uniform": each entry of data is picked with the same probability.
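The uniformity claim can be checked empirically. The following is a minimal sketch, assuming small illustrative values for n, k and nTrials: it draws many samples with randperm and compares how often each index is selected against the theoretical inclusion probability k/n.

```matlab
n = 20;            % population size (illustrative)
k = 5;             % sample size (illustrative)
nTrials = 20000;   % number of repeated draws (illustrative)
counts = zeros(1,n);
for t = 1:nTrials
    idx = randperm(n,k);              % k distinct indices from 1..n
    counts(idx) = counts(idx) + 1;    % safe because idx contains no repeats
end
freq = counts / nTrials;              % empirical inclusion frequency per index
expected = k / n;                     % theoretical inclusion probability
maxDev = max(abs(freq - expected));   % largest deviation from the expected value
```

If the sampler is fair, every entry of freq should be close to expected, so maxDev should be small and should shrink as nTrials grows.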
Anything that draws from a uniform distribution is "fair", because every value in the specified range is equally likely to be produced. The rand function in MATLAB is one example.