
R not doing samples properly

I'm trying to generate a vector v of 4 random integers in the range 1 to 3, in non-decreasing order. For that, I came up with this approach:

v<-1:4
v[1]<-sample(1:3,1)
for (i in 2:4) v[i]<-sample(v[i-1]:3,1)

EDIT: Since there has been some confusion, to clarify: I want the vector to represent a mathematical set (strictly speaking, a multiset), so I basically want random sets of cardinality 4 built from 3 different elements, which can (and obviously must) repeat.

The problem is that a set like {1,1,1,1} can appear in only one way, while a set like {1,2,3,3} can appear in 12 different ways (since order doesn't matter in a mathematical set), so the latter would be 12 times more likely to appear. I'm looking for a way to obtain one of those sets at random, with all of them having the same probability of appearing. What I posted should work if it weren't for that problem.
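As a rough check of the counts quoted above, here is a small sketch that enumerates every ordered draw and groups the draws by their sorted form. The names `draws`, `key` and `counts` are made up for this illustration.

```r
# All 3^4 = 81 ordered draws of four values from 1:3
draws <- expand.grid(1:3, 1:3, 1:3, 1:3)

# Collapse each draw to its sorted form, e.g. (3, 1, 3, 2) becomes "1233"
key <- apply(draws, 1, function(r) paste(sort(r), collapse = ""))

# Number of ordered draws that map to each sorted form
counts <- table(key)
counts[c("1111", "1233")]   # 1 and 12

# The same 12 from the multinomial coefficient 4! / (1! * 1! * 2!)
factorial(4) / (factorial(1) * factorial(1) * factorial(2))
```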

For some reason though, it's not working. I've figured out that, when it reaches the top of the range, it messes up and starts treating all integers in the range as possible again, while in reality there's only one possibility left.

i.e., as soon as it reaches 3 in my particular problem, it should be executing:

sample(3:3,1)

which should always return 3. Instead, it seems to be executing:

sample(1:3,1)

Is there any workaround for this?

This is a little clunky, but you can define an alternate sample function that doesn't default to sample.int when a scalar is passed as the argument:

sample.alt = function(x) ifelse(length(x)>1, sample(x, 1), x)

And use that rather than sample.
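For context, `?sample` documents that when `x` is a single number that is at least 1, sampling takes place from `1:x`, which is why `sample(3:3, 1)` behaves like `sample(1:3, 1)`. The sketch below first shows that behaviour and then plugs the one-argument wrapper into the loop from the question; the name `draws` is only for illustration.

```r
# sample() treats a length-one numeric x >= 1 as the range 1:x,
# so sample(3:3, 1) is really sample(1:3, 1).
set.seed(42)
draws <- replicate(1000, sample(3:3, 1))
table(draws)              # 1, 2 and 3 all show up, not just 3

# One-argument wrapper from the answer above
sample.alt <- function(x) ifelse(length(x) > 1, sample(x, 1), x)

# The loop from the question, with sample.alt for the later draws
v <- integer(4)
v[1] <- sample(1:3, 1)
for (i in 2:4) v[i] <- sample.alt(v[i - 1]:3)
v                         # non-decreasing, every entry in 1:3
```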

Edit: Glad to help @LMartin. I had to run for a seminar earlier, so I didn't make this function completely robust. Ideally, this function should have all the same options sample does; unfortunately, ifelse returns a vector the same length as the logical argument passed to it, which is handy with vectors, but not great for this problem:

> x = 1:10
> ifelse(length(x)>1, x, 0)
[1] 1

So we just have to do it the long way:

sample.alt = function(x, size, replace = FALSE, prob = NULL){
  if (length(x) > 1){
    sample(x, size, replace, prob)
  }
  else{
    rep(x, size)
  }
}
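The examples in `?sample` suggest a similar wrapper, `resample`, which indexes into `x` so that a length-one vector is never expanded. Below is a sketch that uses it in the loop from the question; `draw_sequential` and `sims` are names invented here. Note that fixing the `sample` call does not by itself make every sorted vector equally likely, which the simulation at the end illustrates.

```r
# Wrapper in the style of the resample() example in ?sample:
# index into x so that a length-one x is never expanded to 1:x.
resample <- function(x, ...) x[sample.int(length(x), ...)]

# Hypothetical helper: the sequential scheme from the question
draw_sequential <- function(k = 4, n = 3) {
  v <- integer(k)
  v[1] <- resample(1:n, 1)
  for (i in 2:k) v[i] <- resample(v[i - 1]:n, 1)
  v
}

set.seed(1)
sims <- replicate(5000, paste(draw_sequential(), collapse = ""))

# Relative frequency of each sorted vector. The values are far from equal:
# "3333" only needs the first draw to be 3 (probability 1/3), while
# "1111" needs four draws of 1 in a row (probability 1/81).
freq <- prop.table(table(sims))
round(freq, 3)
```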

Here's a different way of thinking about this problem. In short: generate all possible valid sequences, remove duplicates, and then sample from the set of unique sequences.

> set.seed(1)
> x <- unique(t(apply(expand.grid(1:3,1:3,1:3,1:3),1,sort)),MARGIN=1)
> x
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1    1    2
 [3,]    1    1    1    3
 [4,]    1    1    2    2
 [5,]    1    1    2    3
 [6,]    1    1    3    3
 [7,]    1    2    2    2
 [8,]    1    2    2    3
 [9,]    1    2    3    3
[10,]    1    3    3    3
[11,]    2    2    2    2
[12,]    2    2    2    3
[13,]    2    2    3    3
[14,]    2    3    3    3
[15,]    3    3    3    3
> x[sample(1:nrow(x),1),]
[1] 1 1 2 2
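If enumerating every sequence gets too expensive for larger ranges, one possible alternative (not part of the answer above) is the "stars and bars" correspondence: a sorted 4-element subset of 1:6 maps one-to-one onto a non-decreasing vector of four values in 1:3. A minimal sketch, assuming that "uniform" means each of the 15 rows above is equally likely; `draw_multiset` and `all_sets` are hypothetical names:

```r
# Draw k values from 1:n as a sorted multiset, with each of the
# choose(n + k - 1, k) multisets equally likely.
draw_multiset <- function(k = 4, n = 3) {
  # A sorted k-subset of 1:(n + k - 1) is strictly increasing ...
  combo <- sort(sample.int(n + k - 1, k))
  # ... and subtracting 0, 1, ..., k - 1 turns it into a non-decreasing
  # vector with entries in 1:n.
  combo - 0:(k - 1)
}

# Check the correspondence for k = 4, n = 3: mapping every 4-subset of 1:6
# should give 15 distinct sorted rows with entries in 1:3.
all_sets <- sweep(t(combn(6, 4)), 2, 0:3)
nrow(unique(all_sets))

set.seed(1)
draw_multiset()
```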
