简体   繁体   中英

R Overwrite column values with non NA values from column in separate dataframe

I have a dataframe 'df1' with a lot of columns, but the ones of interest are:

Number Code
1
2
3
10
11 AMRO
4
277
2100 BLPH

And I have another dataframe 'df2' with a lot of columns, but the ones of interest are:

Number Code
1 AMCR
2 AMCR
3 BANO
10 BAEA
12 AMRO
4 NA
277 NA
2100 NA

I want matching values in the 'Number' columns of 'df1' and 'df2' to lead to values in the 'Code' column in 'df2' to overwrite the 'Code' values in 'df1' as long as the 'Code' values in 'df2' don't contain an NA, so that the final result of 'df1' looks like:

Number Code
1 AMCR
2 AMCR
3 BANO
10 BAEA
11 AMRO
4
277
2100 BLPH

Thank you for your help!

We can do

library(powerjoin)
power_left_join(df1, df2, by = "Number", conflict = coalesce)

-output

Number Code
1      1 AMCR
2      2 AMCR
3      3 BANO
4     10 BAEA
5     11 AMRO
6      4 <NA>
7    277 <NA>
8   2100 BLPH

Or to do an overwrite, use data.table

library(data.table)
setDT(df1)[df2, Code := fcoalesce(Code, i.Code), on = .(Number)]

-output

> df1
   Number   Code
    <int> <char>
1:      1   AMCR
2:      2   AMCR
3:      3   BANO
4:     10   BAEA
5:     11   AMRO
6:      4   <NA>
7:    277   <NA>
8:   2100   BLPH

data

df1 <- structure(list(Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L
), Code = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH")), 
class = "data.frame", row.names = c(NA, 
-8L))

df2 <- structure(list(Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L
), Code = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA
)), class = "data.frame", row.names = c(NA, -8L))

Here is an alternative approach using bind_cols :

library(dplyr)

bind_cols(df1, df2) %>% 
  mutate(Code = coalesce(Code...2, Code...4)) %>% 
  select(Number = Number...1, Code)

 Number Code
1      1 AMCR
2      2 AMCR
3      3 BANO
4     10 BAEA
5     11 AMRO
6      4 <NA>
7    277 <NA>
8   2100 BLPH

Here is a solution playing with dplyr full_join and inner_join

library(dplyr)
df1 %>% 
  full_join(df2) %>% na.omit() %>% 
  full_join(df1 %>% inner_join(df2)) %>% 
  filter(Number %in% df1$Number) %>%
  arrange(Number)

Output


#>   Number Code
#> 1      1 AMCR
#> 2      2 AMCR
#> 3      3 BANO
#> 4      4 <NA>
#> 5     10 BAEA
#> 6     11 AMRO
#> 7    277 <NA>
#> 8   2100 BLPH

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM