R: Overwrite column values with non-NA values from a column in a separate dataframe
I have a dataframe 'df1' with a lot of columns, but the ones of interest are:

| Number | Code |
|---|---|
| 1 | |
| 2 | |
| 3 | |
| 10 | |
| 11 | AMRO |
| 4 | |
| 277 | |
| 2100 | BLPH |

And I have another dataframe 'df2' with a lot of columns, but the ones of interest are:

| Number | Code |
|---|---|
| 1 | AMCR |
| 2 | AMCR |
| 3 | BANO |
| 10 | BAEA |
| 12 | AMRO |
| 4 | NA |
| 277 | NA |
| 2100 | NA |

I want to match rows on the 'Number' column of 'df1' and 'df2' and, where the numbers match, overwrite the 'Code' value in 'df1' with the 'Code' value from 'df2', as long as the 'df2' value is not NA. The final 'df1' should look like:

| Number | Code |
|---|---|
| 1 | AMCR |
| 2 | AMCR |
| 3 | BANO |
| 10 | BAEA |
| 11 | AMRO |
| 4 | |
| 277 | |
| 2100 | BLPH |

Thank you for your help!
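We can use power_left_join() from the powerjoin package, which returns a new data frame:

    library(powerjoin)
    # coalesce_yx (exported by powerjoin) takes the df2 value first
    # and falls back to the df1 value when the df2 value is NA
    power_left_join(df1, df2, by = "Number", conflict = coalesce_yx)
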
Output:

      Number Code
    1      1 AMCR
    2      2 AMCR
    3      3 BANO
    4     10 BAEA
    5     11 AMRO
    6      4 <NA>
    7    277 <NA>
    8   2100 BLPH

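Or, to overwrite df1 in place, use data.table:

    library(data.table)
    # Update join: for each df1 row matched on Number, take df2's Code (i.Code)
    # unless it is NA, in which case keep the existing df1 Code
    setDT(df1)[df2, Code := fcoalesce(i.Code, Code), on = .(Number)]
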
Output:

    > df1
       Number   Code
        <int> <char>
    1:      1   AMCR
    2:      2   AMCR
    3:      3   BANO
    4:     10   BAEA
    5:     11   AMRO
    6:      4   <NA>
    7:    277   <NA>
    8:   2100   BLPH

Data:

    df1 <- structure(list(Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L),
                          Code = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH")),
                     class = "data.frame", row.names = c(NA, -8L))

    df2 <- structure(list(Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L),
                          Code = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA)),
                     class = "data.frame", row.names = c(NA, -8L))

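For comparison, the same update can be sketched in base R with match(). This is a minimal sketch rather than part of the answers above. It assumes 'Number' is unique in df2 (match() only returns the first hit), and the names idx, new_code, replace_rows and result are purely illustrative.

```r
# Sample data, same values as in the question
df1 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L),
  Code   = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L),
  Code   = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Position of each df1$Number in df2$Number (NA when there is no match)
idx <- match(df1$Number, df2$Number)

# Candidate replacement values from df2, aligned to the rows of df1
new_code <- df2$Code[idx]

# Overwrite only where df2 supplies a non-NA value (assumes unique keys in df2)
result <- df1
replace_rows <- !is.na(new_code)
result$Code[replace_rows] <- new_code[replace_rows]

print(result)
```

Because the lookup goes through 'Number', the row order of df2 does not matter, and rows of df1 without a match (such as 11) keep their existing 'Code'.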
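Here is an alternative approach using bind_cols(). Note that it pairs rows by position rather than matching on 'Number', so it relies on both data frames having the same number of rows in a compatible order:
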
    library(dplyr)
    bind_cols(df1, df2) %>%
      mutate(Code = coalesce(Code...2, Code...4)) %>%
      select(Number = Number...1, Code)

      Number Code
    1      1 AMCR
    2      2 AMCR
    3      3 BANO
    4     10 BAEA
    5     11 AMRO
    6      4 <NA>
    7    277 <NA>
    8   2100 BLPH

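When the rows of the two data frames cannot be assumed to line up, a key-based variant may be safer. The following is a minimal sketch using dplyr's left_join() and coalesce(). It is an illustration, not the answerer's method, and it assumes 'Number' is unique in df2 so that the join does not duplicate rows. The name result is illustrative.

```r
library(dplyr)

# Sample data, same values as in the question
df1 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L),
  Code   = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L),
  Code   = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  # Match on Number; the clashing Code columns become Code.x (df1) and Code.y (df2)
  left_join(df2, by = "Number") %>%
  # Prefer the df2 value, fall back to the df1 value when df2 is NA or unmatched
  mutate(Code = coalesce(Code.y, Code.x)) %>%
  # Drop the helper columns, keeping every other df1 column
  select(-Code.x, -Code.y)

print(result)
```

left_join() keeps every row of df1 in its original order, so any other columns of df1 pass through unchanged.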
Here is a solution using dplyr's full_join() and inner_join():

    library(dplyr)
    df1 %>%
      full_join(df2) %>% na.omit() %>%
      full_join(df1 %>% inner_join(df2)) %>%
      filter(Number %in% df1$Number) %>%
      arrange(Number)
    #>   Number Code
    #> 1      1 AMCR
    #> 2      2 AMCR
    #> 3      3 BANO
    #> 4      4 <NA>
    #> 5     10 BAEA
    #> 6     11 AMRO
    #> 7    277 <NA>
    #> 8   2100 BLPH