R: Restricted permutations more efficiently than using for loops

Question

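I'm trying to permute a character vector a of variable length, picking 3 elements at a time without repetition. Order matters only for the first element, not for the second and third (e.g. abc != bac != cab, but abc = acb and bca = bac). Each set of 3 permuted elements should be one row of a data frame b.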

A vector with the letters a, b, c, d, e would give this expected output:

abc
abd
abe
acd
ace
ade

bac
bad
bae
bcd
bce
bde

cab 
cad
cae
cbd
cbe
cde

dab
dac
dae
dbc
dbe
dce

eab
eac
ead
ebc
ebd
ecd

With 3 nested for loops I can produce this output, but it is slow when the vector is long.

a = letters[1:5]
aL = length(a)
b <- data.frame(var1 = character(),
                var2 = character(), 
                var3 = character(), 
                stringsAsFactors = FALSE) 


# restricted permutations for moderation
pracma::tic()
for(i in 1:aL){
  for(j in 1:(aL-1)){
    for(k in (j+1):aL){
      if(j != i & k != i) { 
        b <- rbind(b, data.frame(a[i], a[j], a[k])) }
    }
  }
}
pracma::toc()
#> elapsed time is 0.070000 seconds
b
#>    a.i. a.j. a.k.
#> 1     a    b    c
#> 2     a    b    d
#> 3     a    b    e
#> 4     a    c    d
#> 5     a    c    e
#> 6     a    d    e
#> 7     b    a    c
#> 8     b    a    d
#> 9     b    a    e
#> 10    b    c    d
#> 11    b    c    e
#> 12    b    d    e
#> 13    c    a    b
#> 14    c    a    d
#> 15    c    a    e
#> 16    c    b    d
#> 17    c    b    e
#> 18    c    d    e
#> 19    d    a    b
#> 20    d    a    c
#> 21    d    a    e
#> 22    d    b    c
#> 23    d    b    e
#> 24    d    c    e
#> 25    e    a    b
#> 26    e    a    c
#> 27    e    a    d
#> 28    e    b    c
#> 29    e    b    d
#> 30    e    c    d

Created on 2019-07-17 by the reprex package (v0.2.1)

How can I get the same result in less time? Would recursion be faster?

Any help is greatly appreciated. Thanks.

Answer 1

I propose the following solution:

a = letters[1:5]
A = t(combn(a,3)) # all three-letter combinations, one per row,
                  # disregarding order
Full = rbind(A, A[,3:1], A[,c(2,3,1)]) # reorder the columns so that each element
                                       # of every combination is in first place once
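
The rows of Full are valid under the stated restriction, but they are not grouped by first letter and the second and third columns are not always in sorted order. The following is a minimal sketch of a variant that reproduces the row order of the expected output in the question. It assumes the input has at least three distinct elements; `restricted_perms` is a hypothetical helper name, not part of the original answer.

```r
# Hypothetical helper (not from the original answer): put each element first
# in turn and append every unordered pair drawn from the remaining elements.
restricted_perms <- function(a) {
  stopifnot(length(a) >= 3)
  rows <- lapply(seq_along(a), function(i) {
    pairs <- t(combn(a[-i], 2))  # unordered pairs from the other elements
    cbind(a[i], pairs)           # a[i] is recycled down the first column
  })
  do.call(rbind, rows)           # character matrix with 3 columns
}

a <- letters[1:5]
res <- restricted_perms(a)
nrow(res)                                  # length(a) * choose(length(a) - 1, 2)
head(apply(res, 1, paste, collapse = ""))  # first few rows as strings
```

Because `combn()` keeps the input order within each pair, the result comes out grouped by first element without an extra sort. Wrap it in `as.data.frame()` if a data frame is needed.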

Answer 2

Here is an option for your specific example:

library(gtools)
library(dplyr)

# example vector
vec = letters[1:5]

# vectorised function to rearrange elements (based on your restriction)
f = function(x1,x2,x3) paste0(c(x1, sort(c(x2,x3))), collapse = " ")
f = Vectorize(f)

permutations(length(vec), 3, vec) %>%      # get permutations
  data.frame(., stringsAsFactors = F) %>%  # save as data frame
  mutate(vec = f(X1,X2,X3)) %>%            # apply function to each row
  distinct(vec, .keep_all = T)             # keep distinct vec values

#    X1 X2 X3   vec
# 1   a  b  c a b c
# 2   a  b  d a b d
# 3   a  b  e a b e
# 4   a  c  d a c d
# 5   a  c  e a c e
# 6   a  d  e a d e
# 7   b  a  c b a c
# ...

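It was not clear whether you want the output as 3 separate columns with one element each, or as a single column holding the whole triple, so I kept both for you to choose from. You can keep the columns {X1, X2, X3} or only vec.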
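
Since the restriction amounts to keeping only one ordering of the last two elements, the same rows can also be selected with a plain filter instead of `Vectorize` and `distinct`. The sketch below is an illustration under assumptions, not the answer's method: it swaps in base R `expand.grid()` for `gtools::permutations()` to avoid dependencies, and it compares positions rather than values, so the input does not have to be sorted.

```r
vec <- letters[1:5]
n <- length(vec)

# All ordered triples of positions (stand-in for gtools::permutations)
idx <- expand.grid(i = seq_len(n), j = seq_len(n), k = seq_len(n))

# Keep triples whose first position differs from the other two and whose
# last two positions are in increasing order (which also implies j != k)
keep <- idx$i != idx$j & idx$i != idx$k & idx$j < idx$k
idx <- idx[keep, ]
idx <- idx[order(idx$i, idx$j, idx$k), ]

out <- data.frame(X1 = vec[idx$i], X2 = vec[idx$j], X3 = vec[idx$k],
                  stringsAsFactors = FALSE)
nrow(out)  # n * choose(n - 1, 2)
```

Like the permutation-based version, this builds every ordered triple before filtering, so memory grows with the cube of the vector length.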

Answer 3

Here is a simple rewrite of the triple for loop as a triple lapply loop.

t1 <- system.time({
for(i in 1:aL){
  for(j in 1:(aL-1)){
    for(k in (j+1):aL){
      if(j != i & k != i) { 
        b <- rbind(b, data.frame(a[i], a[j], a[k])) }
    }
  }
}
})

t2 <- system.time({
d <- lapply(1:aL, function(i){
  tmp <- lapply(1:(aL-1), function(j){
    tmp <- lapply((j+1):aL, function(k){
      if(j != i & k != i) c(a[i], a[j], a[k])
    })
    do.call(rbind, tmp)
  })
  do.call(rbind, tmp)
})
d <- do.call(rbind.data.frame, d)
names(d) <- paste("a", 1:3, sep = ".")
})

all.equal(b, d)
#[1] "Names: 3 string mismatches"

rbind(t1, t2)
#   user.self sys.self elapsed user.child sys.child
#t1     0.051        0   0.051          0         0
#t2     0.017        0   0.018          0         0
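
Part of the cost of the original loop plausibly comes from growing b with `rbind` on every iteration. As a sketch of that idea, the version below keeps the three loops but fills a preallocated character matrix; the names `m`, `r`, `n_rows` and `d2` are introduced here for illustration, and no timing is claimed for it.

```r
a <- letters[1:5]
aL <- length(a)

# One row per (first element, unordered pair of the remaining elements)
n_rows <- aL * choose(aL - 1, 2)
m <- matrix(NA_character_, nrow = n_rows, ncol = 3)

r <- 0L  # index of the last row filled so far
for (i in seq_len(aL)) {
  for (j in seq_len(aL - 1)) {
    if (j == i) next
    for (k in (j + 1):aL) {
      if (k == i) next
      r <- r + 1L
      m[r, ] <- c(a[i], a[j], a[k])
    }
  }
}

# Convert once at the end instead of calling rbind inside the loop
d2 <- as.data.frame(m, stringsAsFactors = FALSE)
```

It can be timed with `system.time()` in the same way as t1 and t2 above.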

3 answers

Answer 1
5, accepted, 2019-07-17 12:14:59

Answer 2
2, 2019-07-17 12:13:22

Answer 3
2, 2019-07-17 12:17:42
