R dplyr left join multiple tables without two separate columns with suffix

Suppose I have a main table x

x <- tibble(id = c(1,2,3,4,5), score = c(100,200,300,100,200))
x
# A tibble: 5 x 2
     id score
  <dbl> <dbl>
1     1   100
2     2   200
3     3   300
4     4   100
5     5   200

and two other tables

y = tibble(id = c(1,2), score_new=c(200,300))
y
# A tibble: 2 x 2
     id score_new
  <dbl>     <dbl>
1     1       200
2     2       300

z = tibble(id = c(3,4), score_new = c(300,400))
z
# A tibble: 2 x 2
     id score_new
  <dbl>     <dbl>
1     3       300
2     4       400

If I join them together it will be like this:

x %>% left_join(y, by =c("id" = "id")) %>% left_join(z, by =c("id" = "id"))
# A tibble: 5 x 4
     id score score_new.x score_new.y
  <dbl> <dbl>       <dbl>       <dbl>
1     1   100         200          NA
2     2   200         300          NA
3     3   300          NA         300
4     4   100          NA         400
5     5   200          NA          NA

But I need score_new to be only one column. How do I do that? Sorry if there are already other similar questions but I really couldn't find them.

You can do that by stacking y and z into a single table first and then joining that table to x.

# Loading required libraries
library(dplyr)

# Create sample df
x <- tibble(id = c(1,2,3,4,5), score = c(100,200,300,100,200))
y = tibble(id = c(1,2), score_new=c(200,300))
z = tibble(id = c(3,4), score_new = c(300,400))

x %>%
  # union y and z and join on x to get new scores
  left_join(union_all(y,z), by = "id")

Similarly, you can use bind_rows() instead of union_all(); both give the same result in this scenario.

x %>%
  # union y and z and join on x to get new scores
  left_join(bind_rows(y,z), by = "id")
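
One caveat, sketched here under the assumption that the lookup tables might overlap: if the same id appeared in both y and z, the stacked table would hold two rows for that id, and the left join would return two rows for it as well. The table z2 below is hypothetical and not part of the question; counting ids before joining is one way to check.

```r
library(dplyr)

x <- tibble(id = c(1, 2, 3, 4, 5), score = c(100, 200, 300, 100, 200))
y <- tibble(id = c(1, 2), score_new = c(200, 300))
# Hypothetical variant of z that also contains id 2 (not in the original question)
z2 <- tibble(id = c(2, 3), score_new = c(999, 300))

stacked <- bind_rows(y, z2)

# ids that occur more than once in the stacked lookup table
dup_ids <- stacked %>% count(id) %>% filter(n > 1)

# The join repeats the row for id 2, so the result has 6 rows instead of 5
joined <- x %>% left_join(stacked, by = "id")
```

With the y and z from the question there is no overlap, so dup_ids would be empty and the join keeps one row per id.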

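Alternatively, keep both joins and merge the two suffixed columns afterwards:

x %>% left_join(y, by = "id") %>% left_join(z, by = "id") %>%
  mutate(score_new.x = if_else(is.na(score_new.x), score_new.y, score_new.x)) %>%
  select(-score_new.y) %>%
  rename(score_new = score_new.x)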
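
The if_else(is.na(...)) step can also be written with dplyr's coalesce(), which returns the first non-missing value across its arguments. A minimal sketch using the tables from the question; the name result is only illustrative:

```r
library(dplyr)

x <- tibble(id = c(1, 2, 3, 4, 5), score = c(100, 200, 300, 100, 200))
y <- tibble(id = c(1, 2), score_new = c(200, 300))
z <- tibble(id = c(3, 4), score_new = c(300, 400))

result <- x %>%
  left_join(y, by = "id") %>%
  left_join(z, by = "id") %>%
  # take score_new.x where present, otherwise fall back to score_new.y
  mutate(score_new = coalesce(score_new.x, score_new.y)) %>%
  # drop the two suffixed helper columns
  select(-score_new.x, -score_new.y)
```

If both suffixed columns held a value for the same id, coalesce() would keep the one from y, because score_new.x is listed first.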

I'm a bit late to the party, but I would opt for this tidyverse solution:

bind_rows(y, z) %>% left_join(x = x)

which gives the following output:

# A tibble: 5 x 3
     id score score_new
  <dbl> <dbl>     <dbl>
1     1   100       200
2     2   200       300
3     3   300       300
4     4   100       400
5     5   200        NA

Note: left_join() has x and y arguments. Here I've specified x = x, where the right-hand side is your main table, so the piped result of bind_rows() is passed as y.
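
When no by argument is given, left_join() joins on every column the two tables share and prints a message naming them. A minimal sketch that makes the key explicit, assuming the lookup tables are kept in a list (the names lookups and result are hypothetical):

```r
library(dplyr)

x <- tibble(id = c(1, 2, 3, 4, 5), score = c(100, 200, 300, 100, 200))
y <- tibble(id = c(1, 2), score_new = c(200, 300))
z <- tibble(id = c(3, 4), score_new = c(300, 400))

# bind_rows() also accepts a list of data frames, which scales to more tables
lookups <- list(y, z)

# Name the join key explicitly instead of relying on the shared-column default
result <- left_join(x, bind_rows(lookups), by = "id")
```

Naming the key matters once the tables share another column, since the default would then join on that column too.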
