How to get all combinations of 2 from a grouped column in a data frame

I could write a loop to do this, but I was wondering how this might be done in R with dplyr. I have a data frame with two columns. Column 1 is the group, Column 2 is the value. I would like a data frame that has every combination of two values from each group in two separate columns. For example:

input = data.frame(col1 = c(1,1,1,2,2), col2 = c("A","B","C","E","F"))
input
#>   col1 col2
#> 1    1    A
#> 2    1    B
#> 3    1    C
#> 4    2    E
#> 5    2    F

and have it return

output = data.frame(col1 = c(1,1,1,2), col2 = c("A","B","C","E"), col3 = c("B","C","A","F"))
output
#>   col1 col2 col3
#> 1    1    A    B
#> 2    1    B    C
#> 3    1    C    A
#> 4    2    E    F

I'd like to be able to include it within dplyr syntax:

input %>%
  group_by(col1) %>%
  ???

I tried writing my own function that produces a data frame of combinations like what I need from a vector and sent it into the group_map function, but didn't have success:

combos = function(x, ...) {
  x = t(combn(x, 2))
  return(as.data.frame(x))
}

input %>%
  group_by(col1) %>%
  group_map(.f = combos)

This produced an error.

Any suggestions?

You can do:

library(dplyr)

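# For each group, build a matrix with one row per pair of col2 values
data <- input %>%
  group_by(col1) %>%
  summarise(col2 = t(combn(col2, 2)))
# data$col2 is a matrix column; split it into ordinary columns
cbind(data[1], data.frame(data$col2))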

#   col1 X1    X2   
#  <dbl> <chr> <chr>
#1     1 A     B    
#2     1 A     C    
#3     1 B     C    
#4     2 E     F    
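
With dplyr 1.1.0 or later, summarise() warns when a group returns more than one row, and reframe() is the verb meant for that case. Below is a minimal sketch, assuming that dplyr version is installed. The output names col2 and col3 follow the question, and result is just an illustrative variable name:

```r
library(dplyr)

input <- data.frame(col1 = c(1, 1, 1, 2, 2),
                    col2 = c("A", "B", "C", "E", "F"),
                    stringsAsFactors = FALSE)

result <- input %>%
  group_by(col1) %>%
  reframe({
    # one row per pair of values within the current group
    m <- t(combn(col2, 2))
    # an unnamed data frame result is unpacked into separate columns
    tibble(col2 = m[, 1], col3 = m[, 2])
  })

result
```

A group with only one value in col2 would make combn() fail here (n < m), so such groups would need to be filtered out first.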
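
Another option, using tidyr and purrr:

library(dplyr)
library(tidyr)  # nest(), unnest()
library(purrr)  # map()

input %>%
  group_by(col1) %>%
  nest(data=-col1) %>% 
  mutate(out= map(data, ~ t(combn(unlist(.x), 2)))) %>% 
  unnest(out)  %>% select(-data)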
# A tibble: 4 x 2
# Groups:   col1 [2]
   col1 out[,1] [,2] 
  <dbl> <chr>   <chr>
1     1 A       B    
2     1 A       C    
3     1 B       C    
4     2 E       F  
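
The `out[,1] [,2]` header appears because t(combn(x, 2)) returns a matrix, which ends up stored as a single matrix column. A small base R sketch of what that matrix looks like and how it can be split into ordinary columns (the names x, m and pairs_df are only illustrative):

```r
x <- c("A", "B", "C")

# combn() returns one column per pair; t() turns that into one row per pair
m <- t(combn(x, 2))
dim(m)  # 3 rows (pairs), 2 columns (first and second member)

# pull each matrix column out as a plain vector
pairs_df <- data.frame(col2 = m[, 1], col3 = m[, 2],
                       stringsAsFactors = FALSE)
pairs_df
```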

Or:

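library(purrr)  # for invoke()

# .keep = TRUE keeps the grouping column in x, so x[[1, 1]] is the group key
combos = function(x, ...) {
  return(tibble(col1=x[[1,1]],col2=t(combn(unlist(x[[2]], use.names=FALSE), 2))))
}

input %>%
  group_by(col1) %>%
  group_map(.f = combos, .keep=TRUE) %>% invoke(rbind,.) %>% tibble 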
# A tibble: 4 x 2
   col1 col2[,1] [,2] 
  <dbl> <chr>    <chr>
1     1 A        B    
2     1 A        C    
3     1 B        C    
4     2 E        F    
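
The attempt in the question fails because group_map() hands each group to .f as a data frame (without the grouping column by default), so combn() sees an object of length 1 (one column) and stops with n < m. Below is a sketch of a corrected version, assuming the same input. It pulls the col2 vector out of the data frame and uses group_modify(), which expects a data frame back and prepends the group key. The name result is illustrative:

```r
library(dplyr)

input <- data.frame(col1 = c(1, 1, 1, 2, 2),
                    col2 = c("A", "B", "C", "E", "F"),
                    stringsAsFactors = FALSE)

# x is the data frame for one group; ... absorbs the group key
combos <- function(x, ...) {
  m <- t(combn(x$col2, 2))  # work on the vector, not the data frame
  data.frame(col2 = m[, 1], col3 = m[, 2], stringsAsFactors = FALSE)
}

result <- input %>%
  group_by(col1) %>%
  group_modify(combos) %>%
  ungroup()

result
```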

Thank you! In terms of parsimony, I like both the answer from Ben:

input %>% 
  group_by(col1) %>% 
  do(data.frame(t(combn(.$col2, 2))))

and the one from Ronak:

data <- input %>%
  group_by(col1) %>%
  summarise(col2 = t(combn(col2, 2)))
cbind(data[1], data.frame(data$col2))
