Left join in R with multiple conditions

Question

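I'm trying to replace ids with their respective values. The problem is that each id maps to a different value depending on the preceding column, type, like this: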

>df
  type id 
1  q1   1
2  q1   2
3  q2   1
4  q2   3
5  q3   1
6  q3   2

These are the ids for each type and their values:

>q1
  id value
1 1  yes
2 2  no

>q2 
   id value
1  1  one hour
2  2  two hours
3  3  more than two hours

>q3
  id value
1 1  blue
2 2  yellow

I've tried something like this:

df <- left_join(subset(df, type %in% c("q1")), q1, by = "id")

But this drops the rows for the other types.

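I'd like to know how to do this as a one-liner (or something close to it), because there are more than 20 vectors with type descriptions.

Any ideas on how to do it?

This is the df I expect: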

>df
  type id value
1  q1   1 yes
2  q1   2 no
3  q2   1 one hour
4  q2   3 more than two hours
5  q3   1 blue
6  q3   2 yellow

Answer 1

You can join on more than one variable. The expected df you give would actually make a suitable lookup table for this:

value_lookup <- data.frame(
  type = c('q1', 'q1', 'q2', 'q2', 'q3', 'q3'),
  id = c(1, 2, 1, 3, 1, 2),
  value = c('yes', 'no', 'one hour', 'more than two hours', 'blue', 'yellow')
)

Then you just join on both type and id:

df <- left_join(df, value_lookup, by = c('type', 'id'))

Usually, when I need a lookup table like this, I store it in a CSV rather than writing it all out in the code, but do whatever suits you.
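The CSV variant mentioned above could look roughly like the sketch below. It assumes dplyr is installed; the file path is hypothetical (a temp file stands in for a CSV kept with the project), and the CSV contents are simply the q1, q2 and q3 tables from the question stacked with a type column.

```r
library(dplyr)

# Data from the question, rebuilt so the example is self-contained
df <- data.frame(
  type = c("q1", "q1", "q2", "q2", "q3", "q3"),
  id   = c(1L, 2L, 1L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical lookup file: a temp file stands in for a real CSV on disk
lookup_path <- tempfile(fileext = ".csv")
writeLines(c(
  "type,id,value",
  "q1,1,yes",
  "q1,2,no",
  "q2,1,one hour",
  "q2,2,two hours",
  "q2,3,more than two hours",
  "q3,1,blue",
  "q3,2,yellow"
), lookup_path)

# Read the lookup table and join on both key columns
value_lookup <- read.csv(lookup_path, stringsAsFactors = FALSE)
result <- left_join(df, value_lookup, by = c("type", "id"))
print(result)
```

Because this is a left join, every row of df is kept; a (type, id) pair missing from the CSV would get NA in value rather than being dropped.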

Answer 2

tempList = split(df, df$type)
do.call(rbind,
          lapply(names(tempList), function(nm)
              merge(tempList[[nm]], get(nm))))
#  id type               value
#1  1   q1                 yes
#2  2   q1                  no
#3  1   q2            one hour
#4  3   q2 more than two hours
#5  1   q3                blue
#6  2   q3              yellow
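The code above relies on get() finding objects named q1, q2, q3 in the calling environment. A variation, sketched here in base R only, keeps the lookup tables in a named list instead and passes all.x = TRUE so rows without a match are kept. The list name lookups is an assumption made for this example, and every type in df is assumed to have an entry in it.

```r
# Data from the question, rebuilt so the example is self-contained
df <- data.frame(
  type = c("q1", "q1", "q2", "q2", "q3", "q3"),
  id   = c(1L, 2L, 1L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical container: one lookup table per type, keyed by the type name
lookups <- list(
  q1 = data.frame(id = 1:2, value = c("yes", "no"), stringsAsFactors = FALSE),
  q2 = data.frame(id = 1:3,
                  value = c("one hour", "two hours", "more than two hours"),
                  stringsAsFactors = FALSE),
  q3 = data.frame(id = 1:2, value = c("blue", "yellow"), stringsAsFactors = FALSE)
)

# Split df by type, merge each piece with its own lookup table, then stack
pieces <- lapply(split(df, df$type), function(part) {
  merge(part, lookups[[part$type[1]]], by = "id", all.x = TRUE)
})
result <- do.call(rbind, pieces)
rownames(result) <- NULL  # drop the "q1.1"-style row names added by rbind
print(result)
```

Note that merge() puts the key column first and sorts each piece by id, so the column order is id, type, value.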

Answer 3

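Get the values of the data.frame objects whose names match 'q\\d+' into a list, bind them together into a single data.frame with bind_rows while creating the 'type' column from the object names, and right_join with the dataset object 'df'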

library(tidyverse)
mget(paste0("q", 1:3)) %>% 
    bind_rows(.id = 'type') %>% 
    right_join(df)
#  type id               value
#1   q1  1                 yes
#2   q1  2                  no
#3   q2  1            one hour
#4   q2  3 more than two hours
#5   q3  1                blue
#6   q3  2              yellow
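For reference, the same idea can be written as a self-contained sketch, assuming dplyr is installed. It differs from the code above in three ways that are choices made for this example: the tables sit in a named list (called lookups here) instead of being fetched with mget, the join keys are spelled out, and left_join is used with df on the left so df's row order is the starting point.

```r
library(dplyr)

# Data from the question, rebuilt so the example is self-contained
df <- data.frame(
  type = c("q1", "q1", "q2", "q2", "q3", "q3"),
  id   = c(1L, 2L, 1L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical container: one lookup table per type, keyed by the type name
lookups <- list(
  q1 = data.frame(id = 1:2, value = c("yes", "no"), stringsAsFactors = FALSE),
  q2 = data.frame(id = 1:3,
                  value = c("one hour", "two hours", "more than two hours"),
                  stringsAsFactors = FALSE),
  q3 = data.frame(id = 1:2, value = c("blue", "yellow"), stringsAsFactors = FALSE)
)

# Stack the tables; .id = "type" stores each list name in a new type column
value_lookup <- bind_rows(lookups, .id = "type")

# Join on both keys so each id is resolved within its own type
result <- left_join(df, value_lookup, by = c("type", "id"))
print(result)
```

With 20 or more tables, only the construction of the list changes; the bind and the join stay the same.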

Answer 4

You can do this with a series of left joins:

df1 = left_join(df, q1, by='id') %>% filter(type=="q1")
> df1
  type id value
1   q1  1   yes
2   q1  2    no


df2 = left_join(df, q2, by='id') %>% filter(type=="q2")
> df2
  type id               value
1   q2  1            one hour
2   q2  3 more than two hours

df3 = left_join(df, q3, by='id') %>% filter(type=="q3")
> df3
  type id  value
1   q3  1   blue
2   q3  2 yellow

> rbind(df1,df2,df3)
  type id               value
1   q1  1                 yes
2   q1  2                  no
3   q2  1            one hour
4   q2  3 more than two hours
5   q3  1                blue
6   q3  2              yellow

A one-liner would be:

rbind(left_join(df, q1, by='id') %>% filter(type=="q1"),
      left_join(df, q2, by='id') %>% filter(type=="q2"),
      left_join(df, q3, by='id') %>% filter(type=="q3"))

If you have more vectors, you should probably loop over the names of the vector types and run left_join and bind_rows one at a time, as follows:

vecQs = c(paste("q", seq(1,3,1),sep="")) #Types of variables q1, q2 ...
result = tibble()

#Execute left_join for the types and store it in result.
for(i in vecQs) {       
     result = bind_rows(result, left_join(df,eval(as.symbol(i)) , by='id') %>% filter(type==!!i))
}

This gives:

> result
# A tibble: 6 x 3
  type     id value              
  <chr> <int> <chr>              
1 q1        1 yes                
2 q1        2 no                 
3 q2        1 one hour           
4 q2        3 more than two hours
5 q3        1 blue               
6 q3        2 yellow
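A possible variation on the loop, sketched under the assumption that dplyr is installed and that the lookup tables are collected in a named list (called lookups here, a name introduced for this example): filter each type first and then join, so that only the relevant rows are joined, and collect the pieces with lapply instead of growing result inside a for loop.

```r
library(dplyr)

# Data from the question, rebuilt so the example is self-contained
df <- data.frame(
  type = c("q1", "q1", "q2", "q2", "q3", "q3"),
  id   = c(1L, 2L, 1L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical container: one lookup table per type, keyed by the type name
lookups <- list(
  q1 = data.frame(id = 1:2, value = c("yes", "no"), stringsAsFactors = FALSE),
  q2 = data.frame(id = 1:3,
                  value = c("one hour", "two hours", "more than two hours"),
                  stringsAsFactors = FALSE),
  q3 = data.frame(id = 1:2, value = c("blue", "yellow"), stringsAsFactors = FALSE)
)

# For each type: keep only that type's rows, then join its own lookup table
pieces <- lapply(names(lookups), function(nm) {
  df %>%
    filter(type == nm) %>%
    left_join(lookups[[nm]], by = "id")
})

# Stack the per-type results into one data frame
result <- bind_rows(pieces)
print(result)
```

Rows come back grouped in the order of names(lookups), which matches df here only because df is already sorted by type.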

4 answers

Answer 1
0 accepted 2019-01-21 17:07:27

Answer 2
0 2019-01-21 17:07:29

Answer 3
0 2019-01-21 17:09:14

Answer 4
0 2019-01-21 17:19:38
