
R: A more efficient way to generate restricted permutations than for loops

I am trying to permute a character vector a of variable length, picking 3 elements at a time without repetition. Order matters only for the first element, not for the second and third (e.g. abc != bac != cab, but abc = acb and bca = bac). Each set of 3 permuted elements should be a row in a data frame b.

A vector with the letters a, b, c, d, e would result in this expected output:

abc
abd
abe
acd
ace
ade

bac
bad
bae
bcd
bce
bde

cab 
cad
cae
cbd
cbe
cde

dab
dac
dae
dbc
dbe
dce

eab
eac
ead
ebc
ebd
ecd
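
As a quick sanity check on the listing above: each of the n elements can come first, and the other two form an unordered pair drawn from the remaining n - 1, so there should be n * choose(n - 1, 2) rows. A minimal sketch (count_rows is only an illustrative name, not something from the original post):

```r
# Number of restricted permutations for a vector of n distinct elements:
# n choices for the first slot, times the unordered pairs among the rest.
count_rows <- function(n) n * choose(n - 1, 2)

count_rows(5)
#> [1] 30
```

For the five-letter example this gives 30, which matches the six rows per leading letter shown above.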

Using 3 for loops I think I was able to achieve this output, but it is slow if the vector is long.

a = letters[1:5]
aL = length(a)
b <- data.frame(var1 = character(),
                var2 = character(), 
                var3 = character(), 
                stringsAsFactors = FALSE) 


# restricted permutations for moderation
pracma::tic()
for(i in 1:aL){
  for(j in 1:(aL-1)){
    for(k in (j+1):aL){
      if(j != i & k != i) { 
        b <- rbind(b, data.frame(a[i], a[j], a[k])) }
    }
  }
}
pracma::toc()
#> elapsed time is 0.070000 seconds
b
#>    a.i. a.j. a.k.
#> 1     a    b    c
#> 2     a    b    d
#> 3     a    b    e
#> 4     a    c    d
#> 5     a    c    e
#> 6     a    d    e
#> 7     b    a    c
#> 8     b    a    d
#> 9     b    a    e
#> 10    b    c    d
#> 11    b    c    e
#> 12    b    d    e
#> 13    c    a    b
#> 14    c    a    d
#> 15    c    a    e
#> 16    c    b    d
#> 17    c    b    e
#> 18    c    d    e
#> 19    d    a    b
#> 20    d    a    c
#> 21    d    a    e
#> 22    d    b    c
#> 23    d    b    e
#> 24    d    c    e
#> 25    e    a    b
#> 26    e    a    c
#> 27    e    a    d
#> 28    e    b    c
#> 29    e    b    d
#> 30    e    c    d

Created on 2019-07-17 by the reprex package (v0.2.1)

How can I achieve the same outcome in less time? Is recursion faster?

Any help is greatly appreciated. Thank you.

I propose the following solution:

a = letters[1:5]
A = t(combn(a,3)) # all three-letter combinations, one per row,
                  # disregarding order
Full = rbind(A, A[,3:1], A[,c(2,3,1)]) # put each element of every combination
                                       # in first place exactly once
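
If the row order of the expected output matters, one variation on the same combn idea is to fix each element in first place and take the 2-combinations of the remaining elements. This is only a sketch, assuming a holds at least 3 distinct values; restricted_perms is a hypothetical helper name, not part of the answer above.

```r
# Sketch: for every position i, pair a[i] with all unordered pairs
# drawn from the other elements, then stack the blocks row-wise.
restricted_perms <- function(a) {
  blocks <- lapply(seq_along(a), function(i) {
    # combn() returns one pair per column, so transpose to one pair per row;
    # cbind() recycles the single leading element down the block.
    cbind(a[i], t(combn(a[-i], 2)))
  })
  do.call(rbind, blocks)
}

a <- letters[1:5]
res <- restricted_perms(a)
```

The result is a character matrix with one row per restricted permutation, grouped by the leading element; as.data.frame(res) would turn it into a data frame if that is what you need.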

Here's one option for your specific example:

library(gtools)
library(dplyr)

# example vector
vec = letters[1:5]

# vectorised function to rearrange elements (based on your restriction)
f = function(x1,x2,x3) paste0(c(x1, sort(c(x2,x3))), collapse = " ")
f = Vectorize(f)

permutations(length(vec), 3, vec) %>%          # get permutations
  data.frame(., stringsAsFactors = FALSE) %>%  # save as data frame
  mutate(vec = f(X1,X2,X3)) %>%                # apply function to each row
  distinct(vec, .keep_all = TRUE)              # keep distinct vec values

#    X1 X2 X3   vec
# 1   a  b  c a b c
# 2   a  b  d a b d
# 3   a  b  e a b e
# 4   a  c  d a c d
# 5   a  c  e a c e
# 6   a  d  e a d e
# 7   b  a  c b a c
# ...

It's not clear whether you want your output to be 3 separate columns with 1 element each or one column with the combined elements, so I'm keeping both for you to choose from. You can keep columns {X1, X2, X3} or just vec.
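
The same restriction can also be written as a direct filter on positions instead of sorting and de-duplicating strings. The following is a base R sketch rather than the approach used above; it assumes the elements of vec are distinct, and the names g, keep and res are just illustrative.

```r
# Sketch: express the restriction as an index filter (base R only).
vec <- letters[1:5]
idx <- seq_along(vec)

# expand.grid() varies its first argument fastest, so listing x3 first
# keeps the rows ordered by x1, then x2, then x3.
g <- expand.grid(x3 = idx, x2 = idx, x1 = idx)

# Keep rows where slots 2 and 3 are in increasing index order
# and neither of them repeats the first element.
keep <- g$x2 < g$x3 & g$x1 != g$x2 & g$x1 != g$x3

res <- data.frame(X1 = vec[g$x1[keep]],
                  X2 = vec[g$x2[keep]],
                  X3 = vec[g$x3[keep]],
                  stringsAsFactors = FALSE)
```

Because it compares positions in vec rather than the values themselves, this keeps the order in which the elements were supplied. Note that expand.grid builds all n^3 index triples before filtering, so memory use grows quickly for long vectors.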

The following is a straightforward rewrite of the triple for loop as a triple lapply loop.

# a, aL and the empty data frame b are assumed to be defined as in the question
t1 <- system.time({
for(i in 1:aL){
  for(j in 1:(aL-1)){
    for(k in (j+1):aL){
      if(j != i & k != i) { 
        b <- rbind(b, data.frame(a[i], a[j], a[k])) }
    }
  }
}
})

t2 <- system.time({
d <- lapply(1:aL, function(i){
  tmp <- lapply(1:(aL-1), function(j){
    tmp <- lapply((j+1):aL, function(k){
      if(j != i & k != i) c(a[i], a[j], a[k])
    })
    do.call(rbind, tmp)
  })
  do.call(rbind, tmp)
})
d <- do.call(rbind.data.frame, d)
names(d) <- paste("a", 1:3, sep = ".")
})

all.equal(b, d)
#[1] "Names: 3 string mismatches"

rbind(t1, t2)
#   user.self sys.self elapsed user.child sys.child
#t1     0.051        0   0.051          0         0
#t2     0.017        0   0.018          0         0
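
A likely contributor to the cost of the for loop version is that rbind grows the data frame by one row on every iteration. As a hedged sketch (not timed here, and not part of the comparison above), the loops can be kept as they are while writing into a preallocated character matrix; out and r are illustrative names.

```r
# Sketch: same triple loop, but filling a preallocated matrix
# instead of growing a data frame with rbind().
a <- letters[1:5]
n <- length(a)

# n leading elements times choose(n - 1, 2) unordered pairs for the rest.
out <- matrix(NA_character_, nrow = n * choose(n - 1, 2), ncol = 3)
r <- 0L  # index of the last row written

for (i in seq_len(n)) {
  for (j in seq_len(n - 1)) {
    for (k in (j + 1):n) {
      if (j != i && k != i) {
        r <- r + 1L
        out[r, ] <- c(a[i], a[j], a[k])
      }
    }
  }
}

# Convert once at the end rather than on every iteration.
b2 <- as.data.frame(out, stringsAsFactors = FALSE)
```

Wrapping this block in system.time(), as done for t1 and t2, would show how it compares on your own data.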
