Sampling within group with varied numbers in R

Suppose I have a data frame df:

set.seed(123)
n1  <- 5
n2  <- 8
DVm <- rnorm(n1, 180, 10)
DVf <- rnorm(n2, 175, 6)
df <- data.frame(DV=c(DVm, DVf),
                   IV=factor(rep(c("m", "f"), c(n1, n2))))
df
         DV IV
1  174.3952  m
2  177.6982  m
3  195.5871  m
4  180.7051  m
5  181.2929  m
6  185.2904  f
7  177.7655  f
8  167.4096  f
9  170.8789  f
10 172.3260  f
11 182.3445  f
12 177.1589  f
13 177.4046  f

What I want is a new data frame built by sampling n1 DV values with replacement from the rows where IV == "m" and n2 DV values with replacement from the rows where IV == "f", so that the new data frame has the same dimensions as df and is resampled within each of the groups m and f. Is there a single function for this?

We can use slice_sample within group_modify:

library(dplyr)
df %>%
  group_by(IV) %>%
  # resample each group with replacement, keeping its original size
  group_modify(~ slice_sample(.x, n = nrow(.x), replace = TRUE)) %>%
  ungroup()

-output

# A tibble: 13 × 2
   IV       DV
   <fct> <dbl>
 1 f      177.
 2 f      182.
 3 f      185.
 4 f      178.
 5 f      177.
 6 f      171.
 7 f      172.
 8 f      167.
 9 m      181.
10 m      178.
11 m      174.
12 m      196.
13 m      181.
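
As a sanity check on the approach above, the sketch below rebuilds df from the question, resamples it, and confirms two things: each group keeps its size, and every sampled DV comes from the same group in the original data. The names boot and ok are introduced here for illustration only and do not appear in the original answer.

```r
library(dplyr)

# Rebuild the data frame from the question
set.seed(123)
n1 <- 5
n2 <- 8
df <- data.frame(DV = c(rnorm(n1, 180, 10), rnorm(n2, 175, 6)),
                 IV = factor(rep(c("m", "f"), c(n1, n2))))

# Resample within each group, with replacement
boot <- df %>%
  group_by(IV) %>%
  group_modify(~ slice_sample(.x, n = nrow(.x), replace = TRUE)) %>%
  ungroup()

# Group sizes are unchanged: 8 rows for f, 5 rows for m
table(boot$IV)

# TRUE if every resampled value was drawn from its own group in df
ok <- all(mapply(function(v, g) v %in% df$DV[df$IV == g],
                 boot$DV, as.character(boot$IV)))
ok
```

Because the draw is random, the individual DV values will differ from run to run unless the seed is fixed immediately before sampling; the group sizes will not.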

A simpler solution is to call slice_sample() with prop = 1 after group_by(). The result is still grouped by IV, so add ungroup() if that matters:

df %>% 
  group_by(IV) %>%
  slice_sample(prop=1, replace=TRUE)
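
If dplyr is not available, the same within-group resampling can probably be done in base R. The following is a minimal sketch, not taken from the answers above: it uses ave() to resample row indices inside each level of IV and then indexes df with the result. The names idx and boot_base are hypothetical.

```r
# Rebuild the data frame from the question
set.seed(123)
n1 <- 5
n2 <- 8
df <- data.frame(DV = c(rnorm(n1, 180, 10), rnorm(n2, 175, 6)),
                 IV = factor(rep(c("m", "f"), c(n1, n2))))

# For each level of IV, replace the row indices of that group with a
# sample (with replacement) of those same indices.
# sample.int(length(i)) is used instead of sample(i) so that a group
# with a single row is not misread as "sample from 1:i".
idx <- ave(seq_len(nrow(df)), df$IV,
           FUN = function(i) i[sample.int(length(i), replace = TRUE)])

boot_base <- df[idx, ]
rownames(boot_base) <- NULL

# Same dimensions as df, and the IV column is unchanged
dim(boot_base)
identical(boot_base$IV, df$IV)
```

Unlike the dplyr versions, this keeps the rows in the original order (m first, then f) rather than sorting by group.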
