Obtaining samples of pairs so that there is not repetition of any value in any of the combinations (R)

Question

I have a dataframe that looks like this:

Now I would like to obtain n samples of m pairs (x, y), so that no value is repeated across the pairs of a sample, in either position (x or y).

For example, for m=2: the sample [(1,3),(4,3)] is not a valid solution (3 is repeated in y), and the sample [(1,3),(4,1)] is not valid either (1 appears as the first x and as the second y), but the samples [(1,2),(3,4)] and [(1,1),(2,2)] are valid solutions.
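
The rule can be written down as a small checker. This is only a sketch: the name is_valid_sample is made up for illustration, and it assumes that a sample is a list of numeric c(x, y) pairs and that an x value may equal a y value only inside the same pair (as in the [(1,1),(2,2)] example).

```r
# Hypothetical helper (not from the original post): checks one sample,
# given as a list of c(x, y) pairs, against the rule described above.
is_valid_sample <- function(pairs) {
  xs <- vapply(pairs, function(p) p[1], numeric(1))
  ys <- vapply(pairs, function(p) p[2], numeric(1))

  # No value may repeat among the x's or among the y's
  if (anyDuplicated(xs) > 0 || anyDuplicated(ys) > 0) return(FALSE)

  # An x value may match a y value only within the same pair,
  # so ignore the diagonal and look for any cross-pair match
  cross <- outer(xs, ys, "==")
  diag(cross) <- FALSE
  !any(cross)
}

is_valid_sample(list(c(1, 3), c(4, 3)))  # FALSE: 3 repeated in y
is_valid_sample(list(c(1, 3), c(4, 1)))  # FALSE: 1 is x in pair 1 and y in pair 2
is_valid_sample(list(c(1, 2), c(3, 4)))  # TRUE
is_valid_sample(list(c(1, 1), c(2, 2)))  # TRUE
```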

I have been trying this, but I do not know how to find and remove duplicates of x in y.

y <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
x <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)

df <- data.frame(x, y)

subset(df[sample(nrow(df)),], !duplicated(x) & !duplicated(y))

Answer 1

Here's a function that generates a list of n samples of m elements taken without repeats from vectors x and y:

unique_sets <- function(x, y, m, n) 
{
  lapply(seq(n),  function(z)
                  {
                    xs <- sample(x, m)
                    ys <- sample(unique(y[!(y %in% xs)]), m)
                    mapply(c, xs, ys, SIMPLIFY = FALSE)
                  })
}

So now you can do

y <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
x <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)

set.seed(69)
unique_sets(x, y, m = 2, n = 3)
#> [[1]]
#> [[1]][[1]]
#> [1] 4 2
#> 
#> [[1]][[2]]
#> [1] 1 3
#> 
#> 
#> [[2]]
#> [[2]][[1]]
#> [1] 4 1
#> 
#> [[2]][[2]]
#> [1] 2 3
#> 
#> 
#> [[3]]
#> [[3]][[1]]
#> [1] 4 3
#> 
#> [[3]][[2]]
#> [1] 2 1

Created on 2020-04-16 by the reprex package (v0.3.0)
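
One caveat about the function above: sample(x, m) draws from the full vector x, which holds each value four times, so xs can itself contain a repeated value. The following is a sketch of a variant that samples from the distinct values instead. The name unique_sets_dedup is hypothetical, and the variant assumes there are at least m distinct y values left once the chosen x values are excluded; its output for a given seed will differ from the output shown above.

```r
y <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
x <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)

# Hypothetical variant of unique_sets(); not part of the original answer.
unique_sets_dedup <- function(x, y, m, n) {
  ux <- unique(x)
  lapply(seq_len(n), function(z) {
    # Sample positions rather than values: sample(v, m) treats a
    # length-one numeric v as 1:v, which index-based sampling avoids
    xs <- ux[sample.int(length(ux), m)]

    # Candidate y values: distinct and not already used as an x
    pool <- setdiff(unique(y), xs)
    stopifnot(length(pool) >= m)
    ys <- pool[sample.int(length(pool), m)]

    mapply(c, xs, ys, SIMPLIFY = FALSE)
  })
}

set.seed(69)
res <- unique_sets_dedup(x, y, m = 2, n = 3)
```

Because the y values are drawn only from values not used as x, this variant never returns samples such as [(1,1),(2,2)], even though the question counts them as valid.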

Answer 2

You could probably start with something like this

res <- cbind(df[sample(nrow(df)),], df[sample(nrow(df)),])

and then this

res[,c("x1NotOk", "y1NotOk") ] <- t(apply(res, 1, function(x) x[1:2] %in% x[3:4]))

which will give you something like this

  > res
   x y x.1 y.1 x1NotOk y1NotOk
4  1 4   2   3   FALSE   FALSE
10 3 2   1   2   FALSE    TRUE
5  2 1   4   3   FALSE   FALSE
2  1 2   2   1    TRUE    TRUE
16 4 4   1   1   FALSE   FALSE
....

After that, drop the rows where either x1NotOk or y1NotOk is TRUE, e.g. by indexing the rows with -which(res$x1NotOk | res$y1NotOk).
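
Put together, the steps could look like the sketch below. It is an illustration under a few assumptions: it only covers m = 2, the column names x2 and y2 are chosen here to keep the names unique, and the seed is arbitrary. It filters with a logical index instead of -which(), because -which() returns an empty index when no row is flagged, and indexing with that would drop every row.

```r
x <- rep(1:4, each = 4)
y <- rep(1:4, times = 4)
df <- data.frame(x, y)

set.seed(1)  # arbitrary seed, only for reproducibility

# Two independent shuffles of the rows, placed side by side
a <- df[sample(nrow(df)), ]
b <- df[sample(nrow(df)), ]
res <- data.frame(x = a$x, y = a$y, x2 = b$x, y2 = b$y)  # x2, y2: names assumed here

# Flag rows where a value of the first pair also occurs in the second pair
res$x1NotOk <- res$x == res$x2 | res$x == res$y2
res$y1NotOk <- res$y == res$x2 | res$y == res$y2

# Keep only the rows where neither flag is set
ok <- res[!(res$x1NotOk | res$y1NotOk), ]
```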

Answer 3

Maybe you can try the code below

m <- 2
n <- 5
res <- replicate(n,
                 Map(c,
                     x <- sample(unique(df$X),m),
                     y <- list(sample(setdiff(df$Y,x),m),x)[[sample(2,1)]]),
                 simplify = FALSE)

DATA

df <- rev(expand.grid(Y=1:4,X=1:4))
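
For readers who find the nested call hard to follow, here is an unpacked sketch of the same idea. The helper name draw_pairs and the seed are assumptions made for illustration. Each sample either pairs the chosen x values with y values that do not occur among them, or pairs every x with itself (which gives samples like [(1,1),(2,2)]); the choice between the two is made at random.

```r
df <- rev(expand.grid(Y = 1:4, X = 1:4))
m <- 2
n <- 5

# Hypothetical helper: draws one sample of m pairs from df
draw_pairs <- function(df, m) {
  ux <- unique(df$X)
  xs <- ux[sample.int(length(ux), m)]

  if (sample(2, 1) == 1) {
    # y values taken from those that do not occur among the chosen x values
    pool <- setdiff(df$Y, xs)
    ys <- pool[sample.int(length(pool), m)]
  } else {
    # pair every x with itself
    ys <- xs
  }
  Map(c, xs, ys)
}

set.seed(42)  # arbitrary seed, only for reproducibility
res <- replicate(n, draw_pairs(df, m), simplify = FALSE)
```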

3 answers

solution1
1 ACCEPTED 2020-04-16 19:48:04

solution2
1 2020-04-16 20:24:01

solution3
1 2020-04-16 20:38:05
