
All possible pairs in tidyverse

I would like to create all possible pairs between the rows of a data frame, without duplicates (i.e., A_B counts as the same pair as B_A).

Is there an elegant way to do this in tidyverse?

Example data:

library(tidyverse)

df <- tibble(
  id   = 1:5,
  name = c('Alice', 'Bob', 'Charlie', 'Diane', 'Fred')
)

Expected output:

> df_pairs
# A tibble: 10 x 2
   id    name         
   <chr> <chr>        
 1 1_2   Alice_Bob    
 2 1_3   Alice_Charlie
 3 1_4   Alice_Diane  
 4 1_5   Alice_Fred   
 5 2_3   Bob_Charlie  
 6 2_4   Bob_Diane    
 7 2_5   Bob_Fred     
 8 3_4   Charlie_Diane
 9 3_5   Charlie_Fred 
10 4_5   Diane_Fred 

I was able to do it with crossing, but I'd like to know if there is an easier way:

df_pairs <- df %>%
  select(id1 = id, name1 = name) %>%
  crossing(df %>% select(id2 = id, name2 = name)) %>%
  dplyr::filter(id1 < id2) %>%
  unite(id, id1, id2) %>%
  unite(name, name1, name2)

You can use combn here: it generates each unordered pair exactly once, so there are no duplicates to filter out afterwards.

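# Build every unordered pair of elements of x and join each pair with "_".
get_combn <- function(x) {
  combn(x, 2, paste, collapse = "_")
}

# Apply the helper to every column, then reassemble the results as a data frame.
as.data.frame(lapply(df, get_combn))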

#    id          name
#1  1_2     Alice_Bob
#2  1_3 Alice_Charlie
#3  1_4   Alice_Diane
#4  1_5    Alice_Fred
#5  2_3   Bob_Charlie
#6  2_4     Bob_Diane
#7  2_5      Bob_Fred
#8  3_4 Charlie_Diane
#9  3_5  Charlie_Fred
#10 4_5    Diane_Fred

The same helper can also be applied with purrr::map_df, which returns a tibble instead of a base data frame:

purrr::map_df(df, get_combn)
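
If the two members of each pair need to stay accessible (for example, to build columns other than a pasted label), one option is to call combn on the row numbers rather than on each column. The following is only a minimal sketch in base R, assuming the example data from the question; idx and df_pairs are illustrative names, and the data is rebuilt as a plain data frame so the snippet runs without extra packages.

```r
# Example data from the question, as a base data frame.
df <- data.frame(
  id   = 1:5,
  name = c("Alice", "Bob", "Charlie", "Diane", "Fred"),
  stringsAsFactors = FALSE
)

# combn() on the row count yields a 2 x choose(n, 2) matrix;
# each column holds one unordered pair of row numbers (first < second).
idx <- combn(nrow(df), 2)

# Look up both rows of every pair and join the values with "_".
df_pairs <- data.frame(
  id   = paste(df$id[idx[1, ]],   df$id[idx[2, ]],   sep = "_"),
  name = paste(df$name[idx[1, ]], df$name[idx[2, ]], sep = "_"),
  stringsAsFactors = FALSE
)

print(df_pairs)
```

Because idx keeps the row numbers of both members, the same matrix can be reused to pull any other column for either side of the pair. Wrapping the result in tibble::as_tibble() gives a tibble if that is preferred.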
