
Create Combinations in R by Groups

I want to create a list for my classroom of every possible group of 4 students. If I have 20 students, how can I create this, by group, in R, where each row is one combination and there are 20 columns for the full list of student ids, with columns 1-4 being "group1", columns 5-8 being "group2", and so on?

The code below gives the possible combinations for a single group of 4 students (x1, x2, x3, and x4). Now, for each row listed, what are the possibilities for the other 4 groups of 4 students? So there should be 20 columns (Group1_1:4, Group2_1:4, Group3_1:4, Group4_1:4, Group5_1:4).

combn(c(1:20), m = 4)

Desired Output

Combination 1 = Group1[1, 2, 3, 4] Group2[5, 6, 7, 8], Group3[9, 10, 11, 12], etc. 
Combination 2 = Group1[1, 2, 3, 5]... etc. 

There are a lot of posts about combinations out there, so it's possible this has already been answered and I just couldn't find it. Any help is appreciated!

This is a challenging problem computationally, since I believe there are about 2.5 billion possibilities to enumerate. (If that's mistaken, I'd welcome any insight about where this approach goes wrong.)

Depending on how it's stored, a table with all those groupings might require more RAM than most computers can handle. I'd be impressed to see an efficient way to create that. If we took a "create one combination at a time" approach, it would still take about 42 minutes to generate all the possibilities at 1,000,000 per second, or about a month at only 1,000 per second.
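As a rough check of those estimates, the arithmetic can be sketched in a few lines of R. The count is the one quoted in this answer; the storage figure assumes one 4-byte integer per student id and 20 ids per row, which is only one possible layout.

```r
# Count of groupings quoted in this answer (stored as a double, since it
# exceeds .Machine$integer.max)
n_groupings <- 2546168625

# Time to generate every grouping one at a time at two hypothetical rates
minutes_at_1e6 <- n_groupings / 1e6 / 60      # 1,000,000 per second
days_at_1e3    <- n_groupings / 1e3 / 86400   # 1,000 per second

# Assumption: 20 student ids per row, 4 bytes per id (R integer storage)
gigabytes <- n_groupings * 20 * 4 / 1e9

round(c(minutes = minutes_at_1e6, days = days_at_1e3, GB = gigabytes), 1)
# minutes    days      GB
#    42.4    29.5   203.7
```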

EDIT - added a partial implementation at the bottom to create any desired grouping from #1 to #2,546,168,625. For some purposes, this may be almost as good as actually storing the whole sequence, which is very large.


Let's say we are going to make 5 groups of four students each: Group A, B, C, D, and E.

Let's define Group A as the group Student #1 is in. They can be paired with any three of the other 19 students. I believe there are 969 such combinations of other students:

> nrow(t(combn(1:19, 3)))
[1] 969

Now there are 16 students left for the other groups. Let's assign the first student not already in Group A to Group B. That might be student 2, 3, 4, or 5. It doesn't matter; all we need to know is that there are only 15 students who can be paired with that student. There are 455 such combinations:

> nrow(t(combn(1:15, 3)))
[1] 455

Now there are 12 students left. Again, let's assign the first ungrouped student to Group C, and we have 165 combinations for them with the other 11 students:

> nrow(t(combn(1:11, 3)))
[1] 165

And we have 8 students left, 7 of whom can be paired with the first ungrouped student into Group D in 35 ways:

> nrow(t(combn(1:7, 3)))
[1] 35

And then, once our other groups are determined, there's only one group of four students left, three of whom can be paired with the first ungrouped student:

> nrow(t(combn(1:3, 3)))
[1] 1

That implies about 2.546 billion combinations:

> 969*455*165*35*1
[1] 2546168625
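The same count can be reproduced without materialising each combn() matrix. The sketch below uses base R's choose(); the variable names are illustrative only.

```r
# Number of classmates still available to join the "first ungrouped student"
# at each of the five steps: 19, 15, 11, 7, 3
available <- seq(19, 3, by = -4)

# Ways to pick the 3 remaining members of each group
per_group <- choose(available, 3)
per_group
# [1] 969 455 165  35   1

# Total number of distinct groupings
total <- prod(per_group)
total
# [1] 2546168625
```

Using choose() avoids allocating the 969-row, 455-row, etc. matrices just to count their rows.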

Here's a work-in-progress function that produces a grouping based on any arbitrary sequence number.

1) [in progress] Convert the sequence number to a vector describing which combination number should be used for Groups A, B, C, D, and E. For instance, this should convert #1 to c(1, 1, 1, 1, 1) and #2,546,168,625 to c(969, 455, 165, 35, 1).

2) Convert the combinations to a specific output describing the students in each Group.
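Step 1 is left unfinished above. One possible way to do it is a mixed-radix conversion, sketched below. The function name seq_to_index is hypothetical, and the ordering (last group varies fastest) is an assumption chosen so that #1 and #2,546,168,625 map to the two vectors given in step 1.

```r
# Hypothetical helper: convert a 1-based sequence number into one
# combination index per group, treating the per-group counts as the
# digits' radices (last group varies fastest; this ordering is an assumption).
seq_to_index <- function(n, radices = c(969, 455, 165, 35, 1)) {
  stopifnot(n >= 1, n <= prod(radices))
  rem <- n - 1                          # work zero-based internally
  idx <- numeric(length(radices))
  for (g in rev(seq_along(radices))) {
    idx[g] <- rem %% radices[g] + 1     # back to a 1-based index
    rem <- rem %/% radices[g]
  }
  idx
}

seq_to_index(1)
# [1] 1 1 1 1 1
seq_to_index(2546168625)
# [1] 969 455 165  35   1
```

Doubles are used rather than integers because the largest sequence number exceeds R's maximum integer value.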

groupings <- function(seq_nums) {
  students <- 20
  group_size = 4
  grouped <- NULL
  remaining <- 1:20
  seq_nums_pad <- c(seq_nums, 1) # Last group always uses the only possible combination
  for (g in 1:5) {
    group_relative <- 
      c(1, 1 + t(combn(1:(length(remaining) - 1), group_size - 1))[seq_nums_pad[g], ])
    group <- remaining[group_relative]
    print(group)
    grouped = c(grouped, group)
    remaining <-  setdiff(remaining, grouped)
  }
}

> groupings(c(1,1,1,1))
#[1] 1 2 3 4
#[1] 5 6 7 8
#[1]  9 10 11 12
#[1] 13 14 15 16
#[1] 17 18 19 20
> groupings(c(1,1,1,2))
#[1] 1 2 3 4
#[1] 5 6 7 8
#[1]  9 10 11 12
#[1] 13 14 15 17
#[1] 16 18 19 20
> groupings(c(969, 455, 165, 35))   # This one uses the last possibility for
#[1]  1 18 19 20                    #   each grouping.
#[1]  2 15 16 17
#[1]  3 12 13 14
#[1]  4  9 10 11
#[1] 5 6 7 8

This relies heavily on this answer:

Algorithm that can create all combinations and all groups of those combinations

One thing to note is that the answer is not that dynamic: it only included a solution for groups of 3. To make it more robust, we can generate the code from the input parameters. That is, the following recursive function is created on the fly for groups of 3:

group <- function(input, step){
 len <- length(input) 
 combination[1, step] <<- input[1] 

 for (i1 in 2:(len-1)) { 
   combination[2, step] <<- input[i1] 

   for (i2 in (i1+1):(len-0)) { 
     combination[3, step] <<- input[i2] 

     if (step == m) { 
       print(z); result[z, ,] <<- combination 
       z <<- z+1 
     } else { 
       rest <- setdiff(input, input[c(i1,i2, 1)]) 
       group(rest, step +1) #recursive if there are still additional possibilities
   }} 
 } 
}

This takes around 55 seconds to run for N = 16 and k = 4. I'd like to translate it into Rcpp, but unfortunately I do not have that skillset.

group_N <- function(input, k = 2) {
  N = length(input)
  m = N/k
  combos <- factorial(N) / (factorial(k)^m * factorial(m))

  result <- array(NA_integer_, dim = c(combos, m, k))
  combination = matrix(NA_integer_, nrow = k, ncol = m)

  z = 1

  group_f_start = 'group <- function(input, step){\n len <- length(input) \n combination[1,  step] <<- input[1] \n '
  i_s <- paste0('i', seq_len(k-1))

  group_f_fors = paste0('for (', i_s, ' in ', c('2', if (length(i_s) != 1) {paste0('(', i_s[-length(i_s)], '+1)')}), ':(len-', rev(seq_len(k)[-k])-1, ')) { \n combination[', seq_len(k)[-1], ', step] <<- input[', i_s, '] \n', collapse = '\n ')

  group_f_inner = paste0('if (step == m) { \n result[z, ,] <<- combination \n z <<- z+1 \n } else { \n rest <- setdiff(input, input[c(',
                         paste0(i_s, collapse = ','),
                         ', 1)]) \n group(rest, step +1) \n }')

  eval(parse(text = paste0(group_f_start, group_f_fors, group_f_inner, paste0(rep('}', times = k), collapse = ' \n '))))

  group(input, 1)
  return(result)
}
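For comparison, the same "first element always anchors the next group" recursion can be written without eval(parse(...)). The sketch below is not the code above; partition_groups is a hypothetical helper that returns a list of matrices (one row per group) and has not been benchmarked, so it should only be tried on small inputs.

```r
# Hypothetical helper: enumerate every way to split `input` into unordered
# groups of size k (k >= 2). Each result is a matrix with one group per row.
partition_groups <- function(input, k) {
  stopifnot(k >= 2, length(input) %% k == 0)
  if (length(input) == k) return(list(matrix(input, nrow = 1)))

  rest <- input[-1]
  mates <- combn(length(rest), k - 1)    # index sets for the first group's other members
  out <- list()
  for (j in seq_len(ncol(mates))) {
    grp <- c(input[1], rest[mates[, j]]) # first element anchors the group
    remaining <- rest[-mates[, j]]
    # Recurse on whoever is left and prepend the current group
    for (tail_groups in partition_groups(remaining, k)) {
      out[[length(out) + 1]] <- rbind(grp, tail_groups, deparse.level = 0)
    }
  }
  out
}

length(partition_groups(1:4, 2))
# [1] 3
length(partition_groups(1:8, 4))
# [1] 35
```

Growing a list inside nested loops is slow in R, so this is meant to illustrate the recursion rather than to compete with the timings reported here.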

Performance

system.time({test_1 <- group_N(seq_len(4), 2)})
#   user  system elapsed 
#   0.01    0.00    0.02
library(data.table)

#this funky step is just to better show the groups. the provided
## array is fine.

as.data.table(t(rbindlist(as.data.table(apply(test_1, c(1,3), list)))))
#    V1  V2
#1: 1,2 3,4
#2: 1,3 2,4
#3: 1,4 2,3

system.time({test_1 <- group_N(seq_len(16), 4)})
#   user  system elapsed 
#  55.00    0.19   55.29 

as.data.table(t(rbindlist(as.data.table(apply(test_1, c(1,3), list)))))
#very slow
#                  V1          V2          V3          V4
#      1:     1,2,3,4     5,6,7,8  9,10,11,12 13,14,15,16
#      2:     1,2,3,4     5,6,7,8  9,10,11,13 12,14,15,16
#      3:     1,2,3,4     5,6,7,8  9,10,11,14 12,13,15,16
#      4:     1,2,3,4     5,6,7,8  9,10,11,15 12,13,14,16
#      5:     1,2,3,4     5,6,7,8  9,10,11,16 12,13,14,15
#     ---                                                
#2627621:  1,14,15,16  2,11,12,13  3, 6, 9,10     4,5,7,8
#2627622:  1,14,15,16  2,11,12,13     3,7,8,9  4, 5, 6,10
#2627623:  1,14,15,16  2,11,12,13  3, 7, 8,10     4,5,6,9
#2627624:  1,14,15,16  2,11,12,13  3, 7, 9,10     4,5,6,8
#2627625:  1,14,15,16  2,11,12,13  3, 8, 9,10     4,5,6,7

This is now officially part of the production version of RcppAlgos* on CRAN (it was first implemented in the development version).

library(RcppAlgos)
a <- comboGroups(10, numGroups = 2, retType = "3Darray")

dim(a)
[1] 126   5   2

a[1,,]
     Grp1 Grp2
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

a[126,,]
     Grp1 Grp2
[1,]    1    2
[2,]    7    3
[3,]    8    4
[4,]    9    5
[5,]   10    6

Or if you prefer matrices:

a1 <- comboGroups(10, 2, retType = "matrix")

head(a1)
     Grp1 Grp1 Grp1 Grp1 Grp1 Grp2 Grp2 Grp2 Grp2 Grp2
[1,]    1    2    3    4    5    6    7    8    9   10
[2,]    1    2    3    4    6    5    7    8    9   10
[3,]    1    2    3    4    7    5    6    8    9   10
[4,]    1    2    3    4    8    5    6    7    9   10
[5,]    1    2    3    4    9    5    6    7    8   10
[6,]    1    2    3    4   10    5    6    7    8    9

It is also really fast. You can even generate in parallel with nThreads or Parallel = TRUE (the latter uses one less than the system's maximum number of threads) for greater efficiency gains:

comboGroupsCount(16, 4)
[1] 2627625

system.time(comboGroups(16, 4, "matrix"))
 user  system elapsed 
0.107   0.030   0.137

system.time(comboGroups(16, 4, "matrix", nThreads = 4))
 user  system elapsed 
0.124   0.067   0.055
                                ## 7 threads on my machine
system.time(comboGroups(16, 4, "matrix", Parallel = TRUE))
 user  system elapsed 
0.142   0.126   0.047

A really nice feature is the ability to generate samples or specific lexicographical combination groups, especially when the number of results is large.

comboGroupsCount(factor(state.abb), numGroups = 10)
Big Integer ('bigz') :
[1] 13536281554808237495608549953475109376

mySamp <- comboGroupsSample(factor(state.abb), 
                            numGroups = 10, "3Darray", n = 5, seed = 42)

mySamp[1,,]
     Grp1 Grp2 Grp3 Grp4 Grp5 Grp6 Grp7 Grp8 Grp9 Grp10
[1,] AL   AK   AR   CA   CO   CT   DE   FL   LA   MD   
[2,] IA   AZ   ME   ID   GA   OR   IL   IN   MS   NM   
[3,] KY   ND   MO   MI   HI   PA   MN   KS   MT   OH   
[4,] TX   RI   SC   NH   NV   WI   NE   MA   NY   TN  
[5,] VA   VT   UT   OK   NJ   WY   WA   NC   SD   WV   
50 Levels: AK AL AR AZ CA CO CT DE FL GA HI IA ID IL IN KS KY LA MA MD ME MI MN MO MS MT NC ND NE NH NJ NM NV NY OH ... WY

firstAndLast <- comboGroupsSample(state.abb, 10, "3Darray",
                                  sampleVec = c("1",
                                                "13536281554808237495608549953475109376"))

firstAndLast[1,,]
     Grp1 Grp2 Grp3 Grp4 Grp5 Grp6 Grp7 Grp8 Grp9 Grp10
[1,] "AL" "CO" "HI" "KS" "MA" "MT" "NM" "OK" "SD" "VA" 
[2,] "AK" "CT" "ID" "KY" "MI" "NE" "NY" "OR" "TN" "WA" 
[3,] "AZ" "DE" "IL" "LA" "MN" "NV" "NC" "PA" "TX" "WV" 
[4,] "AR" "FL" "IN" "ME" "MS" "NH" "ND" "RI" "UT" "WI" 
[5,] "CA" "GA" "IA" "MD" "MO" "NJ" "OH" "SC" "VT" "WY"

firstAndLast[2,,]
     Grp1 Grp2 Grp3 Grp4 Grp5 Grp6 Grp7 Grp8 Grp9 Grp10
[1,] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "FL" "GA" 
[2,] "WA" "TX" "RI" "OH" "NM" "NE" "MN" "ME" "IA" "HI" 
[3,] "WV" "UT" "SC" "OK" "NY" "NV" "MS" "MD" "KS" "ID" 
[4,] "WI" "VT" "SD" "OR" "NC" "NH" "MO" "MA" "KY" "IL" 
[5,] "WY" "VA" "TN" "PA" "ND" "NJ" "MT" "MI" "LA" "IN"

And finally, generating all 2,546,168,625 ways of splitting 20 people into 5 groups (what the OP asked for) can be achieved in under a minute using the lower and upper arguments:

system.time(aPar <- parallel::mclapply(seq(1, 2546168625, 969969), function(x) {
     combs <- comboGroups(20, 5, "3Darray", lower = x, upper = x + 969968)
     ### do something
     dim(combs)
}, mc.cores = 6))
   user  system elapsed 
217.667  22.932  48.482

sum(sapply(aPar, "[", 1))
[1] 2546168625

Although I started working on this problem over a year ago, this question was a huge inspiration for getting this formalized in a package.

* I am the author of RcppAlgos

Here's an example for smaller numbers. I don't think this will scale well to 20 students.

total_students = 4
each_group = 2
total_groups = total_students/each_group

if (total_students %% each_group == 0) {
    library(arrangements)

    group_id = rep(1:total_groups, each = each_group)

    #There is room to increase efficiency here by generating only relevant permutations
    temp = permutations(1:total_students, total_students)
    temp = unique(t(apply(temp, 1, function(i) {
        x = group_id[i]
        match(x, unique(x))
    })))

    dimnames(temp) = list(COMBO = paste0("C", 1:NROW(temp)),
                          Student = paste0("S", 1:NCOL(temp)))
} else {
    cat("Total students not multiple of each_group")
    temp = NA
}
#> Warning: package 'arrangements' was built under R version 3.5.3
temp
#>      Student
#> COMBO S1 S2 S3 S4
#>    C1  1  1  2  2
#>    C2  1  2  1  2
#>    C3  1  2  2  1

Created on 2019-09-02 by the reprex package (v0.3.0)

The total number of possible ways is given by the following function (from here):

foo = function(N, k) {
    #N is total number or people, k is number of people in each group
    if (N %% k == 0) {
        m = N/k
        factorial(N)/(factorial(k)^m * factorial(m))
    } else {
        stop("N is not a multiple of k")
    }
}

foo(4, 2)
#[1] 3

foo(20, 4)
#[1] 2546168625
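Written out, the count that foo() computes is the multinomial count of ordered groups divided by the number of orderings of the groups. The identity below, with m = N/k, also shows why it agrees with a step-by-step product in which the first ungrouped person always anchors the next group; for N = 20, k = 4 that product is 969 · 455 · 165 · 35 · 1.

```latex
\[
\frac{N!}{(k!)^{m}\, m!}
  \;=\; \frac{1}{m!}\prod_{i=0}^{m-1}\binom{N-ik}{k}
  \;=\; \prod_{i=0}^{m-1}\binom{N-ik-1}{k-1},
  \qquad m = \frac{N}{k}.
\]
% The last step uses \binom{n}{k} = \frac{n}{k}\binom{n-1}{k-1} and
% \prod_{i=0}^{m-1} \frac{N-ik}{k} = \prod_{i=0}^{m-1} (m-i) = m!.
```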

For groups of 4 people from a total of 20 people, the number of possible arrangements is massive.

The code below works.

# Create list of the 20 records
list <- c(1:20)

# Generate all combinations including repetitions
c <- data.frame(expand.grid(rep(list(list), 4))); rm(list)
c$combo <- paste(c$Var1, c$Var2, c$Var3, c$Var4)
# Remove repetitions
c <- subset(c, c$Var1 != c$Var2 & c$Var1 != c$Var3 & c$Var1 != c$Var4 & c$Var2 != c$Var3 & c$Var2 != c$Var4 & c$Var3 != c$Var4)

# Create common group labels (ex. abc, acb, bac, bca, cab, cba would all have "abc" as their group label).
key <- data.frame(paste(c$Var1, c$Var2, c$Var3, c$Var4))
key$group  <- apply(key, 1, function(x) paste(sort(unlist(strsplit(x, " "))), collapse = " "))
c$group <- key$group; rm(key)

# Sort by common group label and id combos by group
c <- c[order(c$group),]
c$Var1 <- NULL; c$Var2 <- NULL; c$Var3 <- NULL; c$Var4 <- NULL;
c$rank <- rep(1:24)

# Pivot
c <- reshape(data=c,idvar="group", v.names = "combo", timevar = "rank", direction="wide")

So you could get all the combinations with the expand.grid function by just passing the vector of data four times. The result will then contain combinations like c(1,1,1,1), so I remove each row that has any duplicated value, and the last part just builds the groupings. It uses 2 loops and is quite slow, but it will get what you want. It could be sped up with the Rcpp package. The code is:

ids = 1:20
d2 = expand.grid(ids,ids,ids,ids)
## Remove rows with duplicated values
pos_use = apply(apply(d2,1,duplicated),2,function(x) all(x == F))
d2_temp = t(apply(d2[pos_use,],1,sort))
list_temp = list()
pos_quitar = NULL
for(i in 1:nrow(d2_temp)){
  pos_quitar = c(pos_quitar,i)
  ini_comb = d2_temp[i,]
  d2_temp_use  = d2_temp[-pos_quitar,]
  temp_comb = ini_comb
  for(j in 2:5){
    pos_quitar_new = which(apply(d2_temp_use,1,function(x) !any(temp_comb%in%x)))[1]
    temp_comb = c(temp_comb,d2_temp_use[pos_quitar_new,])
  }
  pos_quitar = c(pos_quitar,pos_quitar_new)
  list_temp[[i]] = temp_comb
}

list_temp
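The filtering idea behind this approach (build every ordered tuple with expand.grid, drop tuples with repeated ids, then collapse reorderings) is easier to see on a tiny case. The sketch below picks pairs from 4 ids; the variable names are illustrative and not part of the code above.

```r
ids <- 1:4

# Every ordered pair, including repeats such as (1, 1): 4 * 4 = 16 rows
grid <- expand.grid(ids, ids)

# Drop rows where the same id appears twice: 16 - 4 = 12 rows
no_repeats <- grid[grid$Var1 != grid$Var2, ]

# Sort within each row so (2, 1) and (1, 2) look identical, then deduplicate
sorted_pairs <- t(apply(no_repeats, 1, sort))
unique_pairs <- unique(sorted_pairs)

c(nrow(grid), nrow(no_repeats), nrow(unique_pairs))
# [1] 16 12  6
```

The final count matches choose(4, 2). With 20 ids and tuples of length 4, the initial grid already has 20^4 = 160,000 rows, which is part of why this route is slow.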

Here's a function that uses only base R functions for generating possible combinations of groups.

Group_Assignment_Function <- function (Identifiers, Number_of_Items_in_Each_Group, Number_of_Groups) {
  Output <- vector(mode = 'list', length = Number_of_Groups)
  Possible_Groups_Function <- function (x) {
    if (is.list(x)) {
      lapply(x, Possible_Groups_Function)
    } else if (!is.list(x)) {
      as.list(as.data.frame(combn(x, Number_of_Items_in_Each_Group)))
    }
  }
  Remaining_Items_Function <- function (x, y) {
    if (!is.list(y)) {
      lapply(x, function (z) {
        setdiff(y, z)
      })
    } else if (is.list(y)) {
      mapply(Remaining_Items_Function, x = x, y = y, SIMPLIFY = F)
    }
  }
  All_Possible_Groups_Function <- function (x) {
    for (i in seq_len(Number_of_Groups - 1)) {
      if (i == 1) {
        Group_Possibilities <- Possible_Groups_Function(x)
      } else if (i > 1) {
        Group_Possibilities <- Possible_Groups_Function(Remaining_Items)
      }
      Output[[i]] <- Group_Possibilities
      if (!all(sapply(Group_Possibilities, is.list))) {
        Remaining_Items <- lapply(Group_Possibilities, function (y) {
          setdiff(x, y)
        })
      } else if (all(sapply(Group_Possibilities, is.list))) {
        Remaining_Items <- Remaining_Items_Function(Group_Possibilities, Remaining_Items)
      }
    }
    if (Number_of_Groups == 1) {
      Output[[Number_of_Groups]] <- Possible_Groups_Function(x)
    } else if (Number_of_Groups > 1) {
      Output[[Number_of_Groups]] <- Possible_Groups_Function(Remaining_Items)
    }
    Output
  }
  All_Possible_Groups <- All_Possible_Groups_Function(Identifiers)
  Repitition_Times <- choose(length(Identifiers) - (Number_of_Items_in_Each_Group * (0:(Number_of_Groups - 1))), Number_of_Items_in_Each_Group)
  Repitition_Times <- c(Repitition_Times[2:length(Repitition_Times)], 1)
  Repitition_Times <- lapply((length(Repitition_Times) - seq_len(length(Repitition_Times))) + 1, function (x) {
    rev(Repitition_Times)[1:x]
  })
  Repitition_Times <- lapply(Repitition_Times, function (y) {
    Reduce(`*`, y)
  })
  All_Possible_Groups <- lapply(All_Possible_Groups, function(x) {
    z <- sapply(x, function (y) {
      class(y)[1] == "list"
    })
    w <- c(x[!z], unlist(x[z], recursive = F))
    if (sum(z)){
      Recall(w)
    } else if (!sum(z)) {
      w
    }
  })
  All_Possible_Groups <- mapply(function (x, y) {
    x[rep(seq_len(length(x)), each = y)]
  }, x = All_Possible_Groups, y = Repitition_Times, SIMPLIFY = F)
  All_Possible_Groups <- lapply(seq_len(unique(sapply(All_Possible_Groups, length))), function (x) {
    lapply(All_Possible_Groups,"[[", x)
  })
  List_of_Possible_Groups <- lapply(All_Possible_Groups, function (x) {
    names(x) <- paste0("Group_", seq_len(Number_of_Groups))
    x
  })
  names(List_of_Possible_Groups) <- NULL
  Ordered_List_of_Possible_Groups_1 <- lapply(List_of_Possible_Groups, function (x) {
    lapply(x, sort)
  })
  Ordered_List_of_Possible_Groups_2 <- lapply(Ordered_List_of_Possible_Groups_1, function (x) {
    order(sapply(x, function (y) {
      y[1]
    }))
  })
  Ordered_List_of_Possible_Groups_1 <- mapply(function (x, y) {
    x[y]
  }, x = Ordered_List_of_Possible_Groups_1, y = Ordered_List_of_Possible_Groups_2, SIMPLIFY = F)
  Ordered_List_of_Possible_Groups_1 <- lapply(Ordered_List_of_Possible_Groups_1, function (x) {
    do.call('c', x)
      })
  Ordered_List_of_Possible_Groups_1 <- lapply(Ordered_List_of_Possible_Groups_1, function (x) {
    names(x) <- NULL
    x
  })
  List_of_Possible_Groups <- List_of_Possible_Groups[-c(which(duplicated(Ordered_List_of_Possible_Groups_1)))]
  names(List_of_Possible_Groups) <- paste("Possibility", seq_len(length(List_of_Possible_Groups)), sep = "_")
  List_of_Possible_Groups
}

Here's an example of how to use it:

Identifiers <- as.character(1:5)
Number_of_Items_in_Each_Group <- 2
Number_of_Groups <- 2
Group_Assignment_Function(Identifiers = Identifiers, Number_of_Items_in_Each_Group = Number_of_Items_in_Each_Group, Number_of_Groups = Number_of_Groups)
# $Possibility_1
# $Possibility_1$Group_1
# [1] "1" "2"
# 
# $Possibility_1$Group_2
# [1] "3" "4"
# 
# 
# $Possibility_2
# $Possibility_2$Group_1
# [1] "1" "2"
# 
# $Possibility_2$Group_2
# [1] "3" "5"
# 
# 
# $Possibility_3
# $Possibility_3$Group_1
# [1] "1" "2"
# 
# $Possibility_3$Group_2
# [1] "4" "5"
# 
# 
# $Possibility_4
# $Possibility_4$Group_1
# [1] "1" "3"
# 
# $Possibility_4$Group_2
# [1] "2" "4"
# 
# 
# $Possibility_5
# $Possibility_5$Group_1
# [1] "1" "3"
# 
# $Possibility_5$Group_2
# [1] "2" "5"
# 
# 
# $Possibility_6
# $Possibility_6$Group_1
# [1] "1" "3"
# 
# $Possibility_6$Group_2
# [1] "4" "5"
# 
# 
# $Possibility_7
# $Possibility_7$Group_1
# [1] "1" "4"
# 
# $Possibility_7$Group_2
# [1] "2" "3"
# 
# 
# $Possibility_8
# $Possibility_8$Group_1
# [1] "1" "4"
# 
# $Possibility_8$Group_2
# [1] "2" "5"
# 
# 
# $Possibility_9
# $Possibility_9$Group_1
# [1] "1" "4"
# 
# $Possibility_9$Group_2
# [1] "3" "5"
# 
# 
# $Possibility_10
# $Possibility_10$Group_1
# [1] "1" "5"
# 
# $Possibility_10$Group_2
# [1] "2" "3"
# 
# 
# $Possibility_11
# $Possibility_11$Group_1
# [1] "1" "5"
# 
# $Possibility_11$Group_2
# [1] "2" "4"
# 
# 
# $Possibility_12
# $Possibility_12$Group_1
# [1] "1" "5"
# 
# $Possibility_12$Group_2
# [1] "3" "4"
# 
# 
# $Possibility_13
# $Possibility_13$Group_1
# [1] "2" "3"
# 
# $Possibility_13$Group_2
# [1] "4" "5"
# 
# 
# $Possibility_14
# $Possibility_14$Group_1
# [1] "2" "4"
# 
# $Possibility_14$Group_2
# [1] "3" "5"
# 
# 
# $Possibility_15
# $Possibility_15$Group_1
# [1] "2" "5"
# 
# $Possibility_15$Group_2
# [1] "3" "4"

It takes a while for larger numbers of items. If anyone has a better base R solution, I'd love to see it. I'm sure there are more efficient ways, since this approach generates all the possible permutations and then gets rid of the ones that don't actually differ in any group.

The code below gives all unique combinations of 4 selected from 20, without duplicates.

x <- c(1:20)
combinations <- data.frame(t(combn(x, 4)))
