
Assign value to combination of two column values in R

I'm looking for an easier way to assign a unique value to a combination of string values from two columns, where:

  • (colA=2 and colB=4) and (colA=4 and colB=2) are given the same ID
  • (colA=1 and colB=1) can't happen
  • colA and colB are strings

Here is a minimal example of a solution that would work if colA and colB were numeric:

set.seed(3)
a <- sample(1:5, 20, replace = TRUE)
b <- sample(1:5, 20, replace = TRUE)

df<- data.frame(a, b)

library(dplyr)

df<- df %>% 
      filter(a!=b) %>% 
      mutate(abCombination = a*b) %>%
      arrange(abCombination)

df$abFactor <- factor(df$abCombination, labels = c("combination 1", "combination 2",
                                                   "combination 3", "combination 4",
                                                   "combination 5", "combination 6",
                                                   "combination 7"))

I feel that this is an easy task, but I can't think of:

  1. a solution that works with strings
  2. a more elegant (concise) way to code it.

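Assuming that we are looking for a more general approach that works on both numeric and non-numeric columns, one option is to use pmin/pmax to put each pair in a consistent order, paste the two elements together, and then convert the result to a factor. Because pmin and pmax return the element-wise minimum and maximum, (a=2, b=4) and (a=4, b=2) both produce the key "2 4":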

df %>% 
   filter(a != b) %>%
   mutate(abCombination = sprintf('%s %s', pmin(a, b), pmax(a, b))) %>% 
   arrange(abCombination) %>% 
   mutate(abFactor = factor(abCombination, levels = unique(abCombination), 
        labels = paste('Combination', seq_len(n_distinct(abCombination))) ))
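
Since the question asks specifically about string columns, here is a minimal base R sketch of the same pmin/pmax idea applied to character data. The data frame below is made up for illustration (it is not from the original post), and the columns are assumed to be character vectors rather than factors; factor columns would need as.character() first.

```r
# Hypothetical example data: two character columns (not from the original post)
df <- data.frame(
  a = c("x", "y", "z", "x", "y", "z"),
  b = c("y", "x", "x", "z", "y", "y"),
  stringsAsFactors = FALSE
)

# Drop rows where both columns hold the same value
df <- df[df$a != df$b, ]

# Order each pair alphabetically so ("x", "y") and ("y", "x") give the same key
df$abCombination <- paste(pmin(df$a, df$b), pmax(df$a, df$b))

# Number the distinct keys in sorted order and use them as factor labels
keys <- sort(unique(df$abCombination))
df$abFactor <- factor(df$abCombination, levels = keys,
                      labels = paste("Combination", seq_along(keys)))
df
```

Here the rows (x, y) and (y, x) share one label, and the (y, y) row is dropped. If a plain integer ID is preferred over a labelled factor, match(df$abCombination, keys) should give the same numbering.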

