Left join with multiple conditions in R
I'm trying to replace ids with their respective values. The problem is that each id has a different value depending on the preceding column, type, like this:
>df
type id
1 q1 1
2 q1 2
3 q2 1
4 q2 3
5 q3 1
6 q3 2
Here are the ids for each type, with their values:
>q1
id value
1 1 yes
2 2 no
>q2
id value
1 1 one hour
2 2 two hours
3 3 more than two hours
>q3
id value
1 1 blue
2 2 yellow
I've tried something like this:
df <- left_join(subset(df, type %in% c("q1")), q1, by = "id")
But it removes the other values.
I'd like to know how to do this with a one-liner (or something close to it), because there are more than 20 vectors with type descriptions.
Any ideas on how to do it?
This is the df I'm expecting:
>df
type id value
1 q1 1 yes
2 q1 2 no
3 q2 1 one hour
4 q2 3 more than two hours
5 q3 1 blue
6 q3 2 yellow
You can join on more than one variable. The example df you give would actually make a suitable lookup table for this:
value_lookup <- data.frame(
type = c('q1', 'q1', 'q2', 'q2', 'q3', 'q3'),
id = c(1, 2, 1, 3, 1, 2),
value = c('yes', 'no', 'one hour', 'more than two hours', 'blue', 'yellow')
)
Then you just merge on both type and id:
df <- left_join(df, value_lookup, by = c('type', 'id'))
Usually, when I need a lookup table like that, I store it in a CSV rather than writing it all out in the code, but do whatever suits you.
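As a minimal sketch of the CSV approach mentioned above: the file path (a temporary file here), the name lookup_path, and the extra (q2, 2) row are assumptions made for illustration, so that the lookup covers every id from q1, q2 and q3 in the question.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(
  type = c("q1", "q1", "q2", "q2", "q3", "q3"),
  id   = c(1L, 2L, 1L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical file location; a temporary file keeps the sketch self-contained.
lookup_path <- tempfile(fileext = ".csv")

# Write a complete lookup table: every (type, id) pair from q1, q2 and q3.
write.csv(
  data.frame(
    type  = c("q1", "q1", "q2", "q2", "q2", "q3", "q3"),
    id    = c(1, 2, 1, 2, 3, 1, 2),
    value = c("yes", "no", "one hour", "two hours",
              "more than two hours", "blue", "yellow")
  ),
  lookup_path,
  row.names = FALSE
)

# Read the lookup back in and join on both keys at once.
value_lookup <- read.csv(lookup_path, stringsAsFactors = FALSE)
result <- left_join(df, value_lookup, by = c("type", "id"))
```

Because left_join keeps every row of df in its original order, result has the same six rows as df plus the matching value column.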
tempList = split(df, df$type)
do.call(rbind,
lapply(names(tempList), function(nm)
merge(tempList[[nm]], get(nm))))
# id type value
#1 1 q1 yes
#2 2 q1 no
#3 1 q2 one hour
#4 3 q2 more than two hours
#5 1 q3 blue
#6 2 q3 yellow
Get the values of the 'q\\d+' data.frame objects in a list, bind them together into a single data.frame with bind_rows, creating the 'type' column from the object names, and right_join with the dataset object 'df':
library(tidyverse)
mget(paste0("q", 1:3)) %>%
bind_rows(.id = 'type') %>%
right_join(df)
# type id value
#1 q1 1 yes
#2 q1 2 no
#3 q2 1 one hour
#4 q2 3 more than two hours
#5 q3 1 blue
#6 q3 2 yellow
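To make the intermediate steps visible, here is a self-contained sketch of the same idea. The names lookup_list and value_lookup are illustrative, and the join keys are spelled out explicitly (an assumption about intent) instead of letting dplyr infer them.

```r
library(dplyr)

# Per-type lookup tables from the question
q1 <- data.frame(id = 1:2, value = c("yes", "no"), stringsAsFactors = FALSE)
q2 <- data.frame(id = 1:3,
                 value = c("one hour", "two hours", "more than two hours"),
                 stringsAsFactors = FALSE)
q3 <- data.frame(id = 1:2, value = c("blue", "yellow"), stringsAsFactors = FALSE)

df <- data.frame(
  type = c("q1", "q1", "q2", "q2", "q3", "q3"),
  id   = c(1L, 2L, 1L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# mget() looks the objects up by name and returns a named list:
# list(q1 = <data.frame>, q2 = <data.frame>, q3 = <data.frame>)
lookup_list <- mget(paste0("q", 1:3))

# bind_rows() stacks the list; .id stores each element's name in 'type'.
value_lookup <- bind_rows(lookup_list, .id = "type")

# Join on both keys; left_join keeps the row order of df.
result <- left_join(df, value_lookup, by = c("type", "id"))
```

With more than 20 lookup tables, only the 1:3 in paste0() would need to change, provided the objects follow the same naming pattern.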
You can do it with a series of left joins:
df1 = left_join(df, q1, by='id') %>% filter(type=="q1")
> df1
type id value
1 q1 1 yes
2 q1 2 no
df2 = left_join(df, q2, by='id') %>% filter(type=="q2")
> df2
type id value
1 q2 1 one hour
2 q2 3 more than two hours
df3 = left_join(df, q3, by='id') %>% filter(type=="q3")
> df3
type id value
1 q3 1 blue
2 q3 2 yellow
> rbind(df1,df2,df3)
type id value
1 q1 1 yes
2 q1 2 no
3 q2 1 one hour
4 q2 3 more than two hours
5 q3 1 blue
6 q3 2 yellow
A one-liner would be:
rbind(left_join(df, q1, by='id') %>% filter(type=="q1"),
left_join(df, q2, by='id') %>% filter(type=="q2"),
left_join(df, q3, by='id') %>% filter(type=="q3"))
If you have more vectors, you should probably loop over the type names and apply left_join and bind_rows one at a time:
vecQs = c(paste("q", seq(1,3,1),sep="")) #Types of variables q1, q2 ...
result = tibble()
#Execute left_join for the types and store it in result.
for(i in vecQs) {
result = bind_rows(result, left_join(df,eval(as.symbol(i)) , by='id') %>% filter(type==!!i))
}
This will give:
> result
# A tibble: 6 x 3
type id value
<chr> <int> <chr>
1 q1 1 yes
2 q1 2 no
3 q2 1 one hour
4 q2 3 more than two hours
5 q3 1 blue
6 q3 2 yellow
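A possible variant of the loop, sketched here under the assumption that the lookup tables can be collected into a named list (lookup_tables is an illustrative name): filtering before joining avoids joining every row against every table, and lapply avoids growing the result inside the loop.

```r
library(dplyr)

# Per-type lookup tables from the question
q1 <- data.frame(id = 1:2, value = c("yes", "no"), stringsAsFactors = FALSE)
q2 <- data.frame(id = 1:3,
                 value = c("one hour", "two hours", "more than two hours"),
                 stringsAsFactors = FALSE)
q3 <- data.frame(id = 1:2, value = c("blue", "yellow"), stringsAsFactors = FALSE)

df <- data.frame(
  type = c("q1", "q1", "q2", "q2", "q3", "q3"),
  id   = c(1L, 2L, 1L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Named list: each name matches a value of df$type.
lookup_tables <- list(q1 = q1, q2 = q2, q3 = q3)

# For each type, keep only that type's rows, then join its own lookup table.
pieces <- lapply(names(lookup_tables), function(nm) {
  df %>%
    filter(type == nm) %>%
    left_join(lookup_tables[[nm]], by = "id")
})

# Stack the per-type pieces back into one data frame.
result <- bind_rows(pieces)
```

Note that the rows come back grouped by type in the order of the list, which matches df here only because df is already sorted by type.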