R: overwrite column values with non-NA values from a column in a separate dataframe

Question

I have a dataframe 'df1' with a lot of columns, but the ones of interest are:

Number	Code
1
2
3
10
11	AMRO
4
277
2100	BLPH

And I have another dataframe 'df2' with a lot of columns, but the ones of interest are:

Number	Code
1	AMCR
2	AMCR
3	BANO
10	BAEA
12	AMRO
4	NA
277	NA
2100	NA

Where a value in the 'Number' column of 'df1' matches one in 'df2', I want the 'Code' value from 'df2' to overwrite the 'Code' value in 'df1', provided the 'df2' value is not NA. The final 'df1' should look like:

Number	Code
1	AMCR
2	AMCR
3	BANO
10	BAEA
11	AMRO
4
277
2100	BLPH

Thank you for your help!

Answer 1

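We can use power_left_join() from the powerjoin package and resolve the conflicting Code columns by coalescing them:

library(powerjoin)
# coalesce_yx() takes the df2 (y) value first and falls back to df1 (x) when it is NA
power_left_join(df1, df2, by = "Number", conflict = coalesce_yx)

Output: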

  Number Code
1      1 AMCR
2      2 AMCR
3      3 BANO
4     10 BAEA
5     11 AMRO
6      4 <NA>
7    277 <NA>
8   2100 BLPH

Or, to overwrite df1 in place (by reference), use data.table:

library(data.table)
# i.Code is the Code column of df2; listing it first lets non-NA df2 values win
setDT(df1)[df2, Code := fcoalesce(i.Code, Code), on = .(Number)]

Output:

> df1
   Number   Code
    <int> <char>
1:      1   AMCR
2:      2   AMCR
3:      3   BANO
4:     10   BAEA
5:     11   AMRO
6:      4   <NA>
7:    277   <NA>
8:   2100   BLPH

Data

df1 <- structure(list(Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L
), Code = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH")), 
class = "data.frame", row.names = c(NA, 
-8L))

df2 <- structure(list(Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L
), Code = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA
)), class = "data.frame", row.names = c(NA, -8L))
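
For comparison, the same update can be sketched in base R without extra packages. This is a minimal illustration, not part of the original answer: it assumes 'Number' is unique within df2, and the helper names idx and new_code are made up for the example.

```r
# Same example data as above, rebuilt with data.frame() so the block is self-contained
df1 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L),
  Code = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L),
  Code = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA),
  stringsAsFactors = FALSE
)

# For each row of df1, the position of the matching Number in df2 (NA if absent)
idx <- match(df1$Number, df2$Number)

# Candidate replacement: NA where there is no match or where df2's Code is NA
new_code <- df2$Code[idx]

# Keep the existing df1 value wherever the candidate is NA
df1$Code <- ifelse(is.na(new_code), df1$Code, new_code)

df1
```

Because match() returns only the first matching position, duplicated 'Number' values in df2 would need to be resolved before using this approach.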

Answer 2

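Here is an alternative approach using bind_cols. Note that bind_cols pairs rows by position rather than by matching 'Number', so this relies on the two dataframes having the same number of rows in a compatible order, and on each containing only the 'Number' and 'Code' columns: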

library(dplyr)

bind_cols(df1, df2) %>% 
  mutate(Code = coalesce(Code...2, Code...4)) %>% 
  select(Number = Number...1, Code)

  Number Code
1      1 AMCR
2      2 AMCR
3      3 BANO
4     10 BAEA
5     11 AMRO
6      4 <NA>
7    277 <NA>
8   2100 BLPH
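
If the rows of the two dataframes are not guaranteed to line up, a keyed variant of the same coalesce idea may be safer. The following is only a sketch under the assumption that 'Number' is unique in df2; the ".df2" suffix and the name result are arbitrary choices for this example.

```r
library(dplyr)

# Same example data as in Answer 1
df1 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L),
  Code = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L),
  Code = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  # Match rows on Number; df2's Code column arrives as Code.df2
  left_join(df2, by = "Number", suffix = c("", ".df2")) %>%
  # Prefer the df2 value, falling back to df1 when it is NA or unmatched
  mutate(Code = coalesce(Code.df2, Code)) %>%
  select(-Code.df2)

result
```

Here left_join keeps every row of df1 in its original order, and rows of df2 with no counterpart in df1 (such as Number 12) are dropped.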

Answer 3

Here is a solution that combines dplyr's full_join and inner_join:

library(dplyr)

df1 %>% 
  full_join(df2) %>% na.omit() %>% 
  full_join(df1 %>% inner_join(df2)) %>% 
  filter(Number %in% df1$Number) %>%
  arrange(Number)

Output:


#>   Number Code
#> 1      1 AMCR
#> 2      2 AMCR
#> 3      3 BANO
#> 4      4 <NA>
#> 5     10 BAEA
#> 6     11 AMRO
#> 7    277 <NA>
#> 8   2100 BLPH
