R not doing samples properly

I'm trying to generate a vector v of 4 random integers within a range (1-3) in increasing order, and for that I came up with this approach:

v<-1:4
v[1]<-sample(1:3,1)
for (i in 2:4) v[i]<-sample(v[i-1]:3,1)

EDIT: Since there has been some confusion, to clarify: I want the vector to represent a mathematical set (strictly speaking, a multiset), so I basically want random sets of cardinality 4 built from 3 distinct elements, which can (and obviously must) repeat.

But the problem is that sets like {1,1,1,1} can appear in only one way, while sets like {1,2,3,3} can appear in 12 different ways (since order doesn't matter in a mathematical set), so one of those would be 12 times more likely to appear. I'm looking for a way to obtain one of those sets at random, with all of them equally likely to appear. What I posted should work if it weren't for that problem.

For some reason, though, it's not working. I've figured out that when it reaches the top of the range, it messes up and starts treating all integers in the range as possible again, while in reality there's only one possibility left.

I.e., as soon as it reaches 3 in my particular problem, it should be executing:

sample(3:3,1)

which should always yield 3. Instead, it seems to be executing:

sample(1:3,1)

Is there any workaround for this?

This is a little clunky, but you can define an alternate sample function that doesn't default to sample.int when a scalar is passed as the argument:

sample.alt = function(x) ifelse(length(x)>1, sample(x, 1), x)

And use that rather than sample.
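The behaviour the question runs into is documented in ?sample: when x is a single number >= 1, sample(x, ...) draws from 1:x. Below is a minimal check of that, next to the one-line wrapper above. The names draws and fixed and the seed are assumptions made only for this illustration.

```r
# 3:3 evaluates to the length-one vector 3, so sample(3:3, 1) is the same
# call as sample(3, 1), which R treats as sampling from 1:3.
set.seed(42)  # arbitrary seed, only for reproducibility
draws <- replicate(1000, sample(3:3, 1))
sort(unique(draws))   # values other than 3 show up

# The wrapper from this answer returns the scalar itself instead.
sample.alt = function(x) ifelse(length(x)>1, sample(x, 1), x)
fixed <- replicate(1000, sample.alt(3:3))
unique(fixed)         # only 3
```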

Edit: Glad to help, @LMartin. I had to run to a seminar earlier, so I didn't make this function completely robust. Ideally, this function should have all the same options sample does; unfortunately, ifelse returns a vector the same length as the logical argument passed to it, which is handy with vectors, but not great for this problem:

> x = 1:10
> ifelse(length(x)>1, x, 0)
[1] 1

So we just have to do it the long way:

sample.alt = function(x, size, replace = FALSE, prob = NULL){
  if (length(x) > 1){
    sample(x, size, replace, prob)
  }
  else{
    rep(x, size)
  }
}
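As a rough sketch of how this drops into the loop from the question, only the sample call inside the loop is swapped for sample.alt. The seed is arbitrary, and integer(4) is used here instead of 1:4 purely to preallocate.

```r
sample.alt = function(x, size, replace = FALSE, prob = NULL){
  # Vectors longer than one go to sample(); a scalar is repeated as-is.
  if (length(x) > 1) sample(x, size, replace, prob) else rep(x, size)
}

set.seed(1)                 # arbitrary seed, only for reproducibility
v <- integer(4)             # preallocate the result
v[1] <- sample(1:3, 1)      # 1:3 has length 3, so plain sample() is safe here
for (i in 2:4) v[i] <- sample.alt(v[i-1]:3, 1)
v                           # non-decreasing values in 1..3
```

Note that this only fixes the scalar issue. It does not by itself make every sorted vector equally likely: if the first draw is 3, for instance, the result is forced to be 3 3 3 3, which therefore occurs with probability 1/3.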

Here's a different way of thinking about this problem. In short, generate all possible valid sequences, remove duplicates, and then sample from the set of unique sequences.

> set.seed(1)
> x <- unique(t(apply(expand.grid(1:3,1:3,1:3,1:3),1,sort)),MARGIN=1)
> x
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1    1    2
 [3,]    1    1    1    3
 [4,]    1    1    2    2
 [5,]    1    1    2    3
 [6,]    1    1    3    3
 [7,]    1    2    2    2
 [8,]    1    2    2    3
 [9,]    1    2    3    3
[10,]    1    3    3    3
[11,]    2    2    2    2
[12,]    2    2    2    3
[13,]    2    2    3    3
[14,]    2    3    3    3
[15,]    3    3    3    3
> x[sample(1:nrow(x),1),]
[1] 1 1 2 2
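As a sanity check on the enumeration above, the number of multisets of size k drawn from n elements is choose(n + k - 1, k), which for n = 3 and k = 4 gives the 15 rows shown. The sketch below verifies that and tabulates repeated draws. draw_one and picks are hypothetical names introduced only for illustration.

```r
# All 3^4 ordered draws, each row sorted, then duplicate rows dropped.
x <- unique(t(apply(expand.grid(1:3, 1:3, 1:3, 1:3), 1, sort)), MARGIN = 1)

# Stars-and-bars count of multisets: choose(n + k - 1, k).
n <- 3; k <- 4
nrow(x) == choose(n + k - 1, k)   # TRUE, both are 15

# draw_one (hypothetical helper) picks one row uniformly at random.
draw_one <- function(m) m[sample.int(nrow(m), 1), ]

set.seed(1)  # arbitrary seed, only for reproducibility
picks <- replicate(3000, paste(draw_one(x), collapse = ""))
table(picks)  # counts per multiset should be roughly equal
```

Using sample.int on the row count also sidesteps the scalar behaviour of sample, should the matrix ever have a single row.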
