Expand a list to include all possible pairwise combinations within a group

Question

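I am currently running a randomization in which individuals from a particular population are sampled and placed into groups of a defined size. The result is a data frame like the one below: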

Ind Group
Sally   1
Bob 1
Sue 1
Joe 2
Jeff    2
Jess    2
Mary    2
Jim 3
James   3

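Is there a function that lets me expand the dataset to show every possible pairing within each group (desired output below)? The pairings do not need to be reciprocal.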

Group   Ind1    Ind2
1   Sally   Bob
1   Sally   Sue
1   Sue Bob
2   Joe Jeff
2   Joe Jess
2   Joe Mary
2   Jeff    Jess
2   Jess    Mary
2   Jeff    Mary
3   Jim James

I feel there must be a way to do this in dplyr, but for the life of me I can't seem to work it out.

Answer 1

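Another dplyr & tidyr approach: the pipeline is a bit longer, but to me the wrangling feels more straightforward. First, join all the records within each group to one another. Next, combine the two names in each row and sort them alphabetically so that reciprocal duplicates can be eliminated. Finally, split the result back into separate columns.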

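library(dplyr)
library(tidyr)

left_join(dt, dt, by = "Group") %>%   # pair each record with every record in its group
    filter(Ind.x != Ind.y) %>%        # drop self-pairs
    rowwise() %>%
    mutate(name = toString(sort(c(Ind.x, Ind.y)))) %>%   # e.g. "Bob, Sally"
    ungroup() %>%
    select(Group, name) %>%
    distinct() %>%                    # reciprocal pairs now share the same key
    separate(name, into = c("Ind1", "Ind2"), sep = ", ") %>%
    arrange(Group, Ind1, Ind2)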

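Start with a weak cross join of all records within each group.
filter out the self-joins.
Gather the names in each row, sort them, and put them together in the name column.
Now that the names are in alphabetical order, remove the alphabetized reciprocals.
Pull the data back apart into separate columns.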

# A tibble: 10 x 3
   Group Ind1  Ind2
 * <int> <chr> <chr>
 1     1 Bob   Sally
 2     1 Sally Sue
 3     1 Bob   Sue
 4     2 Jeff  Joe
 5     2 Jess  Joe
 6     2 Joe   Mary
 7     2 Jeff  Jess
 8     2 Jeff  Mary
 9     2 Jess  Mary
10     3 James Jim
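
The join-then-deduplicate idea can also be sketched in base R. This is a variant of the approach, not the answer's own code: instead of building a sorted key and calling distinct, it keeps only the rows where the first name sorts before the second. It assumes names are unique within a group, and the example data is re-created inline so the block runs on its own.

```r
# Example data, copied from the question
dt <- data.frame(
  Ind   = c("Sally", "Bob", "Sue", "Joe", "Jeff", "Jess", "Mary", "Jim", "James"),
  Group = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Self-join on Group: every record is paired with every record in its group.
# The suffixes turn the two "Ind" columns into Ind1 and Ind2.
pairs <- merge(dt, dt, by = "Group", suffixes = c("1", "2"))

# Keeping Ind1 < Ind2 drops self-pairs and one of each reciprocal pair
# (assumption: no duplicated names inside a group).
pairs <- pairs[pairs$Ind1 < pairs$Ind2, ]

# Sort for readability and reset the row names
pairs <- pairs[order(pairs$Group, pairs$Ind1, pairs$Ind2), ]
rownames(pairs) <- NULL
pairs
```

Because each pair is kept in exactly one orientation, no separate deduplication step is needed.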

Answer 2

Here is an option using data.table. Convert to a data.table (setDT(dt)), do a cross join (CJ) grouped by "Group", and remove the duplicated elements.

library(data.table)
setDT(dt)[, CJ(Ind1 = Ind, Ind2 = Ind, unique = TRUE)[Ind1 != Ind2], 
             Group][!duplicated(data.table(pmax(Ind1, Ind2), pmin(Ind1, Ind2)))]
#   Group  Ind1  Ind2
#1:     1   Bob Sally
#2:     1   Bob   Sue
#3:     1 Sally   Sue
#4:     2  Jeff  Jess
#5:     2  Jeff   Joe
#6:     2  Jeff  Mary
#7:     2  Jess   Joe
#8:     2  Jess  Mary
#9:     2   Joe  Mary
#10:    3 James   Jim

Or use combn by "Group":

setDT(dt)[, {temp <- combn(Ind, 2); .(Ind1 = temp[1,], Ind2 = temp[2,])}, Group]
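
For comparison, the per-group combn idea can be sketched in base R without any packages. The helper name pairs_in_group is hypothetical, and the guard for groups with fewer than two members is an added assumption (combn raises an error when asked for pairs from a single name); the example data is re-created inline.

```r
# Example data, copied from the question
dt <- data.frame(
  Ind   = c("Sally", "Bob", "Sue", "Joe", "Jeff", "Jess", "Mary", "Jim", "James"),
  Group = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all unordered pairs of `members`, labelled with `group`
pairs_in_group <- function(members, group) {
  if (length(members) < 2) return(NULL)  # no pair can be formed
  m <- combn(members, 2)                 # 2 x choose(n, 2) matrix, one pair per column
  data.frame(Group = group, Ind1 = m[1, ], Ind2 = m[2, ],
             stringsAsFactors = FALSE)
}

# Apply the helper to each group in order of appearance and stack the results;
# rbind skips the NULLs returned for too-small groups.
groups <- unique(dt$Group)
result <- do.call(rbind, lapply(groups, function(g) {
  pairs_in_group(dt$Ind[dt$Group == g], g)
}))
rownames(result) <- NULL
result
```

combn preserves the input order within each pair, so the first row is Sally and Bob rather than the alphabetized Bob and Sally.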

Answer 3

A solution using dplyr. We can use group_by and do to apply the combn function to each group, then combine the results into a data frame.

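library(dplyr)
dt2 <- dt %>%
  group_by(Group) %>%
  # as.data.frame() names the two columns V1 and V2;
  # it replaces as_data_frame(), which is deprecated in recent tibble releases
  do(as.data.frame(t(combn(.$Ind, m = 2)), stringsAsFactors = FALSE)) %>%
  ungroup() %>%
  setNames(sub("V", "Ind", colnames(.)))
dt2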

# # A tibble: 10 x 3
#    Group  Ind1  Ind2
#    <int> <chr> <chr>
#  1     1 Sally   Bob
#  2     1 Sally   Sue
#  3     1   Bob   Sue
#  4     2   Joe  Jeff
#  5     2   Joe  Jess
#  6     2   Joe  Mary
#  7     2  Jeff  Jess
#  8     2  Jeff  Mary
#  9     2  Jess  Mary
# 10     3   Jim James

Data

dt <- read.table(text = "Ind Group
Sally   1
Bob 1
Sue 1
Joe 2
Jeff    2
Jess    2
Mary    2
Jim 3
James   3",
                 header = TRUE, stringsAsFactors = FALSE)
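
As a sanity check on any of the approaches above, the expected number of rows can be computed directly: a group of n members yields choose(n, 2) unordered pairs. The sketch below re-creates the example data inline so it runs on its own; the variable names are illustrative.

```r
# Example data, copied from the question
dt <- data.frame(
  Ind   = c("Sally", "Bob", "Sue", "Joe", "Jeff", "Jess", "Mary", "Jim", "James"),
  Group = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Group sizes: 3, 4 and 2 members
group_sizes <- as.vector(table(dt$Group))

# Unordered pairs per group: choose(3, 2) = 3, choose(4, 2) = 6, choose(2, 2) = 1
pairs_per_group <- choose(group_sizes, 2)

# Total rows any correct answer should return for this data
expected_rows <- sum(pairs_per_group)
expected_rows
```

For this data the total is 10, which matches the ten rows in the desired output and in each answer's result.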

3 answers

Answer 1: score 3, 2017-11-14 04:04:05
Answer 2: score 2, accepted, 2017-11-14 05:29:00
Answer 3: score 1, 2017-11-14 02:11:51
