Updating old dataframe with new dataframe in R
I am trying to update an old dataframe with data from a new dataframe.
I found the option shown below. It works for some of the fields, but not all, and I am not sure how to alter it as it is beyond my skill set. I tried removing the is.na(x) portion of the ifelse code, and that did not work.
df_old <- data.frame(
bb = as.character(c("A", "A", "A", "B", "B", "B")),
y = as.character(c("i", "ii", "ii", "i", "iii", "i")),
z = 1:6,
aa = c(NA, NA, 123, NA, NA, 12))
df_new <- data.frame(
bb = as.character(c("A", "A", "A", "B", "A", "A")),
z = 1:6,
aa = c(NA, NA, 123, 1234, NA, 12))
cols <- names(df_new)[names(df_new) != "z"]
df_old[,cols] <- mapply(function(x, y) ifelse(is.na(x), y[df_new$z == df_old$z], x), df_old[,cols], df_new[,cols])
The code also changes my bb variable from a character vector to a numeric. Do I need another call to mapply that focuses specifically on bb?
To update the aa and bb columns, you can approach this with a join via merge(). This assumes column z is the index (key) for both data frames.
# join on `z` column
df_final <- merge(df_old, df_new, by = "z")
# replace NAs with new values for column `aa` from `df_new`
df_final$aa <- ifelse(is.na(df_final$aa.x), df_final$aa.y, df_final$aa.x)
# choose new values for column `bb` from `df_new`
df_final$bb <- df_final$bb.y
df_final <- df_final[, c("bb", "z", "y", "aa")]
df_final
  bb z   y   aa
1  A 1   i   NA
2  A 2  ii   NA
3  A 3  ii  123
4  B 4   i 1234
5  A 5 iii   NA
6  A 6   i   12
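As an alternative to the merge() approach above, the same update can be sketched with match(), which modifies the columns in place and so keeps their types (bb stays character, aa stays numeric). This is only a minimal sketch under some assumptions: z is unique in df_new, the example data are rebuilt with stringsAsFactors = FALSE so the strings are character on any R version, and the helper names idx, has_match and fill are made up for illustration.

```r
# Same example data as in the question; stringsAsFactors = FALSE keeps
# `bb` and `y` as character vectors regardless of the R version.
df_old <- data.frame(
  bb = c("A", "A", "A", "B", "B", "B"),
  y  = c("i", "ii", "ii", "i", "iii", "i"),
  z  = 1:6,
  aa = c(NA, NA, 123, NA, NA, 12),
  stringsAsFactors = FALSE)
df_new <- data.frame(
  bb = c("A", "A", "A", "B", "A", "A"),
  z  = 1:6,
  aa = c(NA, NA, 123, 1234, NA, 12),
  stringsAsFactors = FALSE)

# For each row of df_old, find the row of df_new with the same `z`
# (NA where df_new has no such key).
idx <- match(df_old$z, df_new$z)
has_match <- !is.na(idx)

df_final <- df_old

# `bb`: take the new value wherever a matching row exists
df_final$bb[has_match] <- df_new$bb[idx[has_match]]

# `aa`: fill only the NAs in the old column, leaving existing values alone
fill <- has_match & is.na(df_final$aa)
df_final$aa[fill] <- df_new$aa[idx[fill]]

# Inspect the column types after the update
str(df_final)
```

Unlike the merge() version, this keeps the original column order of df_old (bb, y, z, aa) and leaves rows of df_old untouched when their z does not appear in df_new.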