Using sample() without replacement and with increasing sample size in R

Question

I want to take "random" samples from a vector called data, with increasing size and without replacement.

To illustrate my point, data looks like this, for example:

data<-c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")

What I need is a set of sampling vectors of increasing size (starting with size = 2 and growing by, for example, 2), where each larger sample keeps the elements of the previous one and no element is drawn twice. Everything should be stored in a list, so that the result looks something like this:

sample_1<-c("s","d")
sample_2<-c("s","d","a","f")
sample_3<-c("s","d","a","f","m","n")
sample_4<-c("s","d","a","f","m","n","l","c")
sample_5<-c("s","d","a","f","m","n","l","c","j","x")
sample_6<-c("s","d","a","f","m","n","l","c","j","x","v","k")
sample_7<-c("s","d","a","f","m","n","l","c","j","x","v","k","g","b")
sample_8<-c("s","d","a","f","m","n","l","c","j","x","v","k","g","b","h")
samples<-list(sample_1,sample_2,sample_3,sample_4,sample_5,sample_6,sample_7,sample_8)

What I have so far is:

samples <- sapply(seq(from = 2, to = length(data), by = 2),
                  function(i) sample(data, size = i, replace = FALSE),
                  simplify = FALSE, USE.NAMES = TRUE)

What does not work is increasing the sample size while keeping the samples from the previous steps, and having a last list element that contains all observations. Is something like this possible?

Answer 1

I'm not sure whether I understood you correctly, but perhaps you only need to shuffle the data once:

data = letters
data_random = sample(data)
sapply(seq(from=2, to=length(data), by=2),
       function (x) data_random[1:x],
       simplify = FALSE)
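
Note that with an odd number of elements, as in the question's 15-element vector, stepping by 2 stops at 14, so the last list element does not contain all observations. A minimal sketch of one way to handle that, assuming data has at least 2 elements; the names shuffled, sizes and the sample_ labels are only illustrative and not part of the original answer:

```r
# Sketch: shuffle once, then take growing prefixes of the shuffled vector.
data <- c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")

shuffled <- sample(data)                            # one permutation, no replacement
sizes <- seq(from = 2, to = length(data), by = 2)   # 2, 4, ..., 14 for 15 elements

# Assumption: the final element should always hold every observation,
# so append the full length when the step of 2 does not land on it.
if (tail(sizes, 1) < length(data)) {
  sizes <- c(sizes, length(data))
}

# Each sample is a prefix of the same permutation, so samples are nested.
samples <- lapply(sizes, function(k) shuffled[seq_len(k)])
names(samples) <- paste0("sample_", seq_along(samples))

lengths(samples)
```

Because every element is a prefix of the same permutation, each sample automatically contains the previous one and no value can appear twice.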

Answer 2

After your comments on the other answer I think I get what you want to achieve, so, extending my previous code, I end up with:

data<-c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")
set.seed(123)
nbitems=ceiling(length(data)/2) # number of results when stepping by 2, rounded up to a whole number for odd lengths
results=vector("list",nbitems)

results[[1]] <- sample(data,2) # get first sample
for (i in 2:nbitems) { # Loop for each result
  samplesavail <- data[!data %in% results[[i-1]]] # Reduce the samples available
  results[[i]] <- c(results[[i-1]], sample( samplesavail, min( length(samplesavail), 2) ) ) # concatenate a new sample, size depends on step and remaining samples available.
}

I hope this matches your intended use:

> results
[[1]]
[1] "n" "f"

[[2]]
[1] "n" "f" "a" "g"

[[3]]
[1] "n" "f" "a" "g" "m" "v"

[[4]]
[1] "n" "f" "a" "g" "m" "v" "x" "l"

[[5]]
 [1] "n" "f" "a" "g" "m" "v" "x" "l" "b" "j"

[[6]]
 [1] "n" "f" "a" "g" "m" "v" "x" "l" "b" "j" "k" "h"

[[7]]
 [1] "n" "f" "a" "g" "m" "v" "x" "l" "b" "j" "k" "h" "d" "s"

[[8]]
 [1] "n" "f" "a" "g" "m" "v" "x" "l" "b" "j" "k" "h" "d" "s" "c"
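
If it helps, here is a rough sketch of how such a result could be checked for the two properties asked for: no duplicates within a sample, and each sample starting with the previous one. It rebuilds the list with a variant of the loop above that assumes the values in data are unique (it uses setdiff()) and draws indices with sample.int(), since sample(x, ...) treats a single number x >= 1 as the range 1:x. The names no_dups and nested are just illustrative.

```r
# Sketch: build nested samples incrementally, then verify them.
data <- c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")

results <- list(sample(data, 2))                    # first sample of size 2
while (length(results[[length(results)]]) < length(data)) {
  prev  <- results[[length(results)]]
  avail <- setdiff(data, prev)                      # assumes unique values in data
  # Draw indices rather than values to avoid sample()'s special case
  # for a length-one numeric vector.
  idx   <- sample.int(length(avail), min(2, length(avail)))
  results[[length(results) + 1]] <- c(prev, avail[idx])
}

# No value appears twice inside any sample.
no_dups <- all(vapply(results, function(s) anyDuplicated(s) == 0L, logical(1)))

# Every sample begins with the complete previous sample.
nested <- all(vapply(seq_along(results)[-1], function(i) {
  identical(results[[i]][seq_along(results[[i - 1]])], results[[i - 1]])
}, logical(1)))

c(no_dups = no_dups, nested = nested)
```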

Previous approach:

If I understood you correctly (though I'm far from sure):

data<-c("a","s","d","f","g","h","j","k","l","x","c","v","b","n","m")
set.seed(123) # fix the seed so the answer is reproducible; remove in real use
nbitems=ceiling(length(data)/2) # number of entries when stepping by 2, rounded up for odd lengths
results=vector("list",nbitems) # preallocate the list (we fill it starting from the end)
results[[nbitems]] = sample(data,length(data)) # shuffle the whole vector
for (i in nbitems:2) {
  results[[i-1]] <- results[[i]][1:(length(results[[i]]) - 2)] # at each iteration, drop the last 2 entries
}

This gives a single entry as the first result.
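
The single-entry first result follows from the arithmetic: starting from all 15 elements and dropping 2 at a time gives odd sizes all the way down to 1. A small sketch comparing those sizes with the ones the question asks for; backward_sizes and forward_sizes are illustrative names only:

```r
n <- 15  # length of the example vector

# Sizes produced by stripping 2 entries at a time from the full shuffle.
backward_sizes <- rev(seq(from = n, to = 1, by = -2))
backward_sizes
# 1 3 5 7 9 11 13 15

# Sizes requested in the question: start at 2, step by 2, end with everything.
forward_sizes <- unique(c(seq(from = 2, to = n, by = 2), n))
forward_sizes
# 2 4 6 8 10 12 14 15
```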

I just noticed this is the same idea as @sbstn's answer, but with a more complicated backward approach. I'm posting it in case it has some value.

Answer 1 (5, accepted, 2016-07-26 14:19:48)

Answer 2 (3, 2016-07-26 14:40:48)