Repeated random sampling of rows

Question

I have a data frame containing 2 columns: column 1 holds IDs and column 2 holds the value associated with each ID (59 distinct rows in total).

Example:

     [ID] [value] 
[1]   a   164  
[2]   b   167  
[3]   c   120  
[4]   d   117  
[5]   e   106

I assume that the only way to randomly sample from column 1 while keeping the associated value in column 2 is to sample whole rows. I need to draw 50 random samples of 1 row, 50 of 2 rows, 50 of 3 rows, 50 of 4 rows, and so on up to 59 rows, ideally with each sample set output as a data frame. I would end up with 59 sets of randomly sampled data. Essentially, this is the same as creating random subsets of the data.

I have this code, which produces a data frame of 10 randomly sampled rows, for example:

sample_df <- df[sample.int(nrow(df), size = 10, replace = TRUE), ]

The question is: how can I adjust this code so that it produces 50 sets of 10 random rows? Should I be using a loop to generate all of the random samples I need?

Answer 1

You can use lapply, which returns a list of data frames:

lapply(1:59, function(x) df[sample(nrow(df), size = x, replace = TRUE),])
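
Note that the call above draws a single sample for each size from 1 to 59. If 50 replicates per size are needed, as the question describes, one possible extension is to wrap the sampling step in replicate(). The following is only a sketch: the data frame built here is a hypothetical stand-in for the asker's 59-row data, and the names n_reps, samples_10 and all_samples are made up for illustration.

```r
# Hypothetical stand-in for the asker's 59-row data frame (not the real data).
set.seed(42)  # only so that this sketch is reproducible
df <- data.frame(
  ID    = paste0("id", 1:59),
  value = sample(100:200, 59, replace = TRUE)
)

n_reps <- 50  # assumed number of replicate samples per sample size

# 50 samples of 10 rows each: a list of 50 data frames.
samples_10 <- replicate(
  n_reps,
  df[sample.int(nrow(df), size = 10, replace = TRUE), ],
  simplify = FALSE
)

# 50 samples for every size from 1 to nrow(df): a list of 59 lists,
# each holding 50 data frames with k rows.
all_samples <- lapply(seq_len(nrow(df)), function(k) {
  replicate(
    n_reps,
    df[sample.int(nrow(df), size = k, replace = TRUE), , drop = FALSE],
    simplify = FALSE
  )
})
```

With this layout, all_samples[[3]][[7]] would be the 7th replicate of size 3. The sketch keeps replace = TRUE from the question's code, so a row can appear more than once in a sample; if true subsets are wanted, replace = FALSE may be the better choice, bearing in mind that a sample of all 59 rows is then just a reordering of the full data frame.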

Solution 1
0 2018-06-04 15:07:44
