
Assign value to combination of two column values in R

I'm looking for an easier way to assign a unique value to each combination of string values from two columns, where:

  • (colA=2 and colB=4) and (colA=4 and colB=2) are given the same ID
  • (colA=1 and colB=1) can't happen
  • colA and colB are strings

Here is a minimal example of a solution that would work if colA and colB were numeric:

set.seed(3)
a <- sample(1:5, 20, replace = TRUE)
b <- sample(1:5, 20, replace = TRUE)

df <- data.frame(a, b)

library(dplyr)

df<- df %>% 
      filter(a!=b) %>% 
      mutate(abCombination = a*b) %>%
      arrange(abCombination)

df$abFactor <- factor(df$abCombination, labels = c("combination 1", "combination 2",
                                                   "combination 3", "combination 4",
                                                   "combination 5", "combination 6",
                                                   "combination 7"))

I feel that this is an easy task, but I can't think of:

  1. a solution that works with strings
  2. a more elegant (concise) way to code it.

Assuming that we are looking for a more general approach that works on both numeric and non-numeric columns, one option is to use pmin/pmax to put each pair in a consistent order, paste the two elements together, and then convert the result to a factor:

df %>%
   filter(a != b) %>%
   # pmin/pmax order each pair, so (2, 4) and (4, 2) give the same key
   mutate(abCombination = sprintf('%s %s', pmin(a, b), pmax(a, b))) %>%
   arrange(abCombination) %>%
   # one label per distinct key, numbered in sorted order
   mutate(abFactor = factor(abCombination, levels = unique(abCombination),
        labels = paste('Combination', seq_len(n_distinct(abCombination)))))
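
For the string case raised in the question, the same idea can be sketched in base R. This is only an illustration under stated assumptions: the sample values and the abID column are made up, and the columns are assumed to be character vectors rather than factors (pmin/pmax compare character vectors element-wise but are not meaningful on unordered factors).

```r
# Hypothetical string data; column names a and b follow the question.
df <- data.frame(
  a = c("x", "y", "z", "x", "y"),
  b = c("y", "x", "x", "x", "z"),
  stringsAsFactors = FALSE
)

# Drop rows where both columns hold the same value.
df <- df[df$a != df$b, ]

# Build an order-independent key: smaller string first, larger second.
key <- paste(pmin(df$a, df$b), pmax(df$a, df$b))

# Number the distinct keys in order of first appearance (abID is an assumed name).
df$abID <- match(key, unique(key))

# Optional factor version with readable labels.
df$abFactor <- factor(
  paste("Combination", df$abID),
  levels = paste("Combination", seq_along(unique(key)))
)
```

Here ("x", "y") and ("y", "x") share one ID because both reduce to the key "x y". If the columns are factors, converting them with as.character() first should make this work.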

