Multiple Random Sampling in R
I have a data frame called liquidation. I want to draw 30 random samples of 1000 observations each from it, record which sample each account came from, and then combine all 30 samples into a single new data frame.
Here is how I did it manually, using the dplyr package for the random sampling, but I would like to simplify it so that it is easy to repeat:
Sample_1 <- liquidation %>%
  sample_n(1000)
Sample_1$Obs <- 1
Sample_2 <- liquidation %>%
  sample_n(1000)
Sample_2$Obs <- 2
Sample_3 <- liquidation %>%
  sample_n(1000)
Sample_3$Obs <- 3
....
Sample_30 <- liquidation %>%
  sample_n(1000)
Sample_30$Obs <- 30
Then I combine them all into a single data frame:
Combined <- rbind(Sample_1, Sample_2, Sample_3, Sample_4, Sample_5, Sample_6, Sample_7, Sample_8, Sample_9, Sample_10,
Sample_11, Sample_12, Sample_13, Sample_14, Sample_15, Sample_16, Sample_17, Sample_18, Sample_19,
Sample_20, Sample_21, Sample_22, Sample_23, Sample_24, Sample_25, Sample_26, Sample_27, Sample_28,
Sample_29, Sample_30)
str(Combined)
'data.frame': 30000 obs. of 31 variables:
Here's an example using mtcars (selecting 5 rows at random, 10 times):
library(dplyr)
Combined <- bind_rows(replicate(10, mtcars %>% sample_n(5), simplify = FALSE), .id = "Obs")
We use the base function replicate() to repeat the sampling multiple times; passing FALSE for simplify makes it return a list of data frames. Then we use dplyr's bind_rows() to merge the samples and keep track of which sample they came from.
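To check that the replicate() and bind_rows() approach produces the expected shape, the steps can be split out as in the sketch below. The seed value and the intermediate name samples are assumptions added for illustration; they are not part of the original answer.

```r
library(dplyr)

set.seed(42)  # assumption: any fixed seed, used only so the draw is repeatable

# replicate() with simplify = FALSE returns a plain list of 10 data frames
samples <- replicate(10, sample_n(mtcars, 5), simplify = FALSE)

# bind_rows() stacks the list; .id = "Obs" labels each row with its list position
combined <- bind_rows(samples, .id = "Obs")

# Each of the 10 samples should contribute exactly 5 rows
count(combined, Obs)
```

Because the list is unnamed, the Obs labels are the positions 1 to 10. If a numeric column is needed, as.integer(combined$Obs) should convert them. In dplyr 1.0.0 and later, sample_n() is superseded by slice_sample(n = 5), which could be swapped in here.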
You should be able to wrap this up into a function (assuming Sample_20, etc. are temporary and you don't need them later on):
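sampling <- function(x, nSamples = 30, nRows = 1000) {
  # Draw nSamples independent samples of nRows rows each and stack them
  do.call('rbind', lapply(seq_len(nSamples), function(n) {
    # Obs records which sample (1..nSamples) each row belongs to
    x %>% sample_n(nRows) %>% mutate(Obs = n)
  }))
}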
It can then be run with:
combined <- sampling(liquidation)
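Since liquidation is not available here, a quick way to try the function is on a built-in data set with smaller arguments. The sketch below restates sampling() (using seq_len()) so that it runs on its own; the seed and the name small are assumptions for illustration only.

```r
library(dplyr)

# Restated from the answer so this block is self-contained
sampling <- function(x, nSamples = 30, nRows = 1000) {
  do.call("rbind", lapply(seq_len(nSamples), function(n) {
    x %>% sample_n(nRows) %>% mutate(Obs = n)
  }))
}

set.seed(1)  # assumption: fixed seed so the result is repeatable

# 3 samples of 5 rows each from mtcars (32 rows, 11 columns)
small <- sampling(mtcars, nSamples = 3, nRows = 5)

dim(small)        # 15 rows, 12 columns (the 11 original columns plus Obs)
table(small$Obs)  # 5 rows for each of the samples 1, 2 and 3
```

Note that sample_n() samples without replacement by default, so nRows must not exceed nrow(x). The default of 1000 therefore only works on a data frame with at least 1000 rows, which the question's 30000-row result suggests liquidation has.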