Take multiple k samples of varying size n by groups in R

I have a dataset that looks like this:

group=rep(1:4,each=100)
values=round(runif(400,25,350),0)

data<-data.frame(values,group)

Each group consists of 100 observations (values).

For each group, I would like to take 20 random samples without replacement, with the sample size starting at 10 and increasing in steps of 5 up to 95.

Thus, for each group I want 20 samples of size 10, 20 samples of size 15, ..., and 20 samples of size 95.
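To make the target concrete, the arithmetic can be checked in base R. This is only an illustrative sketch; the names `sizes`, `reps`, and `n_groups` are assumptions introduced here and are not part of the original code.

```r
# Sample sizes requested: 10, 15, ..., 95
sizes <- seq(10, 95, by = 5)
reps <- 20      # replicates per sample size
n_groups <- 4   # groups in the example data

n_sizes <- length(sizes)               # number of distinct sample sizes
samples_per_group <- reps * n_sizes    # samples drawn from each group
rows_total <- n_groups * reps * sum(sizes)  # rows if all samples are stacked

n_sizes
samples_per_group
rows_total
```

So each group should yield 18 sample sizes times 20 replicates, which is a useful sanity check on whatever solution is used.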

Any idea how to do this with a tidyverse solution?

At the moment I have this:

library(tidyverse)
library(infer)  # the infer package exports rep_sample_n()

data %>% 
  group_by(group) %>% 
  nest() %>% 
  mutate(v = map(data, ~ rep_sample_n(., size = 10, replace = FALSE, reps = 20))) %>% 
  unnest(v)

This seems to correctly draw 20 replicates of a sample of size 10, but I still need to vary the size...

Thanks.

You could create a sequence of sample sizes, wrap your group_by/nest/etc. pipeline in a for loop, then add each new sample to a list.

Notice how the size argument in ~rep_sample_n is now sizes[i] rather than a fixed number.

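sizes <- seq(10, 95, by = 5)

# Pre-allocate one list slot per sample size
sample_list <- vector("list", length(sizes))

for (i in seq_along(sizes)) {

  new_data <- data %>% 
    group_by(group) %>% 
    nest() %>% 
    mutate(v = map(data, ~ rep_sample_n(., size = sizes[i], replace = FALSE, reps = 20))) %>% 
    unnest(v)

  # Use [[i]] so the whole data frame is stored as a single list element
  sample_list[[i]] <- new_data

}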

I am suggesting a for loop instead of lapply(), as it makes more sense to me and this application doesn't take much time anyway.
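If a single data frame is preferred over a list, a loop-free variant is also possible. The following is a minimal sketch, not the method from the question: it swaps rep_sample_n() for dplyr::slice_sample() so that it depends only on dplyr, tidyr, and purrr. The names `reps`, `sz`, `replicate`, and `samples`, and the fixed seed, are assumptions added for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

# Same structure as the example data in the question
data <- data.frame(
  values = round(runif(400, 25, 350), 0),
  group  = rep(1:4, each = 100)
)

sizes <- seq(10, 95, by = 5)
reps  <- 20

# One row per (size, replicate) combination; each row draws its own
# sample of `size` rows from every group, without replacement
samples <- expand_grid(size = sizes, replicate = seq_len(reps)) %>%
  mutate(sample = map(size, function(sz) {
    data %>%
      group_by(group) %>%
      slice_sample(n = sz, replace = FALSE) %>%
      ungroup()
  })) %>%
  unnest(sample)
```

The result is one long tibble with `size`, `replicate`, `values`, and `group` columns, so a follow-up such as `samples %>% group_by(group, size, replicate) %>% summarise(mean = mean(values))` can summarise every sample in one step.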
