Faster alternative to nested loops in R

Question

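Here is the scenario: I have a sample in which subjects are divided into three groups. Next, one subject from each group is grouped together, forming several "triplets" made up of one subject from each group. I want to count the number of times a subject from a given group (1, 2, or 3) is grouped with a subject i from a different original group.

Here is a simple code example: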

data <- cbind(c(1:9), c(rep("Group 1", 3), rep("Group 2", 3), rep("Group 3", 3)))
data <- data.frame(data)
names(data) <- c("ID", "Group")

groups.of.3 <- data.frame(rbind(c(1,4,7),c(2,4,7),c(2,5,7),c(3,6,8),c(3,6,9)))

N <- nrow(data)
n1 <- nrow(data[data$Group == "Group 1", ])
n2 <- nrow(data[data$Group == "Group 2", ])
n3 <- nrow(data[data$Group == "Group 3", ])

# Check the number of times a subject from a group is grouped with a subject i 
# from another group

M1 <- matrix(0, nrow = N, ncol = n1) 
M2 <- matrix(0, nrow = N, ncol = n2)
M3 <- matrix(0, nrow = N, ncol = n3)
for (i in 1:N){
  if (data$Group[i] != "Group 1"){
    for (j in 1:n1){
      M1[i,j] <- nrow(groups.of.3[groups.of.3[,1] == j &
                                  (groups.of.3[,2] == i |
                                  groups.of.3[,3] == i), ])
    }
  }
  if (data$Group[i] != "Group 2"){
    for (j in 1:n2){
      M2[i,j] <- nrow(groups.of.3[groups.of.3[,2] == (n1 + j) &
                                    (groups.of.3[,1] == i | 
                                       groups.of.3[,3] == i), ])
    }
  }
  if (data$Group[i] != "Group 3"){
    for (j in 1:n3){
      M3[i,j] <- nrow(groups.of.3[groups.of.3[,3] == (n1 + n2 + j) & 
                                    (groups.of.3[,1] == i |
                                    groups.of.3[,2] == i), ])
    }
  }
}

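So I have 9 subjects, three in each group. One subject from each group is then grouped together into a triplet (a subject may appear in more than one triplet). With more subjects this takes much longer, and I would like to know whether there is a faster alternative that avoids the for loops.

For example, matrix M1 holds the number of times each subject in Group 1 is grouped with each subject from any other group: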

M1
      [,1] [,2] [,3]
 [1,]    0    0    0
 [2,]    0    0    0
 [3,]    0    0    0
 [4,]    1    1    0
 [5,]    0    1    0
 [6,]    0    0    2
 [7,]    1    2    0
 [8,]    0    0    1
 [9,]    0    0    1

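So the 3 columns represent the three subjects of Group 1 and the rows represent all subjects; each entry is the number of times that Group 1 subject is grouped with the subject in that row (for example, according to groups.of.3, subject 3 appears twice in the same triplet as subject 6, while subject 1 appears once with subject 7).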
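
The two entries mentioned above can be checked directly against groups.of.3. This is only a small sketch on the example data; the names n_3_6 and n_1_7 are made up for illustration.

```r
# Example triplets from the question
groups.of.3 <- data.frame(rbind(c(1,4,7), c(2,4,7), c(2,5,7), c(3,6,8), c(3,6,9)))

# Triplets containing subject 3 (column 1) and subject 6 (column 2)
n_3_6 <- sum(groups.of.3[, 1] == 3 & groups.of.3[, 2] == 6)

# Triplets containing subject 1 (column 1) and subject 7 (column 3)
n_1_7 <- sum(groups.of.3[, 1] == 1 & groups.of.3[, 3] == 7)
```

These correspond to M1[6, 3] and M1[7, 1] in the matrix shown above.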

Thanks for your help!

Answer 1

Something like this?

library(tidyr)
library(dplyr)
data <- data %>% 
  mutate(ID = as.numeric(as.character(ID)))  # works whether ID is a factor or a character vector
tmp <- groups.of.3 %>% 
  add_rownames() %>% 
  gather("X", "Person", -rowname) %>% 
  inner_join(data, by = c("Person" = "ID"))
tmp %>% 
  inner_join(tmp, by = c("rowname")) %>% 
  filter(Group.x != Group.y) %>% 
  group_by(Person.x, Group.x, Group.y) %>% 
  summarise(N = n()) %>% 
  spread(key = Group.y, value = N, fill = 0)

  Person.x Group.x Group 1 Group 2 Group 3
     (dbl)  (fctr)   (dbl)   (dbl)   (dbl)
1        1 Group 1       0       1       1
2        2 Group 1       0       2       2
3        3 Group 1       0       2       2
4        4 Group 2       2       0       2
5        5 Group 2       1       0       1
6        6 Group 2       2       0       2
7        7 Group 3       3       3       0
8        8 Group 3       1       1       0
9        9 Group 3       1       1       0

Answer 2

for loops are not inherently slow:

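# coerce the fields in groups.of.3 to factors that share all subject IDs as levels
for(i in 1:3)
    groups.of.3[,i]  <-  factor(groups.of.3[,i], levels = data$ID)

# accumulate co-occurrence counts over every pair of columns
M <- matrix(0, N, N)
for(i in 1:(3-1))
    for(j in (i+1):3)
        M  <-  M + table(groups.of.3[,i],groups.of.3[,j])
M  <-  M + t(M)  # table() only counts column i against column j; add the reverse direction
grp  <-  as.integer(factor(data$Group))  # group index 1, 2, 3 for each subject
M1  <-  M[,grp==1]
M2  <-  M[,grp==2]
M3  <-  M[,grp==3]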
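
The same pairwise counting can also be written without any explicit loop. The following is only a sketch, assuming subject IDs are the integers 1..N and each triplet contains one subject from each group; the names inc, co and group are made up for illustration. It builds a triplet-by-subject incidence matrix and takes its cross-product.

```r
# Example data from the question
groups.of.3 <- data.frame(rbind(c(1,4,7), c(2,4,7), c(2,5,7), c(3,6,8), c(3,6,9)))
group <- rep(1:3, each = 3)   # original group of subjects 1..9 (assumed layout)
N <- length(group)

# Incidence matrix: one row per triplet, one column per subject,
# 1 where the subject belongs to the triplet
g <- as.matrix(groups.of.3)
inc <- matrix(0, nrow = nrow(g), ncol = N)
inc[cbind(rep(seq_len(nrow(g)), times = ncol(g)), as.vector(g))] <- 1

# crossprod(inc) is t(inc) %*% inc: for every pair of subjects,
# the number of triplets that contain both
co <- crossprod(inc)
diag(co) <- 0   # drop each subject's count with itself

M1 <- co[, group == 1]
M2 <- co[, group == 2]
M3 <- co[, group == 3]
```

On the example data this M1 should agree with the M1 in the question. For large N a sparse incidence matrix may be preferable, but that is outside this sketch.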

Answer 3

I am going to make a very small modification to Thierry's answer to answer my own question:

library(tidyr)
library(dplyr)

data <- data %>%
  mutate(ID = as.numeric(as.character(ID)))  # works whether ID is a factor or a character vector
tmp <- groups.of.3 %>%
  add_rownames() %>%
  gather("X", "Person", -rowname) %>%
  inner_join(data, by = c("Person" = "ID"))
tmp %>% 
  inner_join(tmp, by = c("rowname")) %>%
  filter(Group.x != Group.y) %>%
  group_by(Person.x, Group.x, Person.y) %>%
  summarise(N = n()) %>%
  spread(key = Person.y, value = N, fill = 0)

This gives the following output, which contains M1, M2 and M3 from the earlier for loop, joined together.

Source: local data frame [9 x 11]

  Person.x Group.x     1     2     3     4     5     6     7     8     9
     (dbl)  (fctr) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl)
1        1 Group 1     0     0     0     1     0     0     1     0     0
2        2 Group 1     0     0     0     1     1     0     2     0     0
3        3 Group 1     0     0     0     0     0     2     0     1     1
4        4 Group 2     1     1     0     0     0     0     2     0     0
5        5 Group 2     0     1     0     0     0     0     1     0     0
6        6 Group 2     0     0     2     0     0     0     0     1     1
7        7 Group 3     1     2     0     2     1     0     0     0     0
8        8 Group 3     0     0     1     0     0     1     0     0     0
9        9 Group 3     0     0     1     0     0     1     0     0     0
