Updating old dataframe with new dataframe in R

I am working to update an old dataframe with data from a new dataframe.

I found this option; it works for some of the fields, but not all. I am not sure how to alter it, as that is beyond my skill set. I tried removing the is.na(x) portion of the ifelse code, and that did not work.

df_old <- data.frame(
      bb = as.character(c("A", "A", "A", "B", "B", "B")),
      y = as.character(c("i", "ii", "ii", "i", "iii", "i")),
      z = 1:6,
      aa = c(NA, NA, 123, NA, NA, 12))

df_new <- data.frame(
      bb = as.character(c("A", "A", "A", "B", "A", "A")),
      z = 1:6,
      aa = c(NA, NA, 123, 1234, NA, 12))

cols <- names(df_new)[names(df_new) != "z"]

df_old[,cols] <- mapply(function(x, y) ifelse(is.na(x), y[df_new$z == df_old$z], x), df_old[,cols], df_new[,cols])

The code also changes my bb variable from a character vector to a numeric one. Do I need another call to mapply focusing specifically on bb?
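On the type change: one likely cause, assuming an R version older than 4.0.0 (where data.frame() defaults to stringsAsFactors = TRUE), is that bb is stored as a factor and ifelse() drops the factor class, leaving only the underlying level codes. The sketch below reproduces that effect on two small hypothetical vectors, x and y, which are not part of the original data.

```r
# Hypothetical stand-ins for a factor column in the old and new data
x <- factor(c("A", "B", "A"))
y <- factor(c("A", "A", "A"))

# ifelse() strips attributes, so the result holds level codes, not labels
as_codes <- ifelse(is.na(x), y, x)

# Converting to character first keeps the labels
as_text <- ifelse(is.na(x), as.character(y), as.character(x))

str(as_codes)
str(as_text)
```

If this is the cause, passing stringsAsFactors = FALSE to data.frame(), or converting with as.character() inside the ifelse() call, should keep bb as text.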

To update the aa and bb columns, you can use a join via merge(). This assumes column z is the index (key) for both data frames.

# join on `z` column
df_final <- merge(df_old, df_new, by = "z")
# replace NAs with new values for column `aa` from `df_new`
df_final$aa <- ifelse(is.na(df_final$aa.x), df_final$aa.y, df_final$aa.x)
# choose new values for column `bb` from `df_new`
df_final$bb <- df_final$bb.y
# keep only the final columns, in the desired order
df_final <- df_final[, c("bb", "z", "y", "aa")]

df_final
  bb z   y   aa
1  A 1   i   NA
2  A 2  ii   NA
3  A 3  ii  123
4  B 4   i 1234
5  A 5 iii   NA
6  A 6   i   12
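As an alternative to the join, the same update can be sketched with match(), which looks up each row's z key in df_new and updates df_old in place. This is a minimal sketch under the same assumption that z uniquely identifies rows; the names idx, has_match, fill and df_updated are introduced here for illustration, and stringsAsFactors = FALSE is set so that bb stays character on any R version.

```r
df_old <- data.frame(
  bb = c("A", "A", "A", "B", "B", "B"),
  y  = c("i", "ii", "ii", "i", "iii", "i"),
  z  = 1:6,
  aa = c(NA, NA, 123, NA, NA, 12),
  stringsAsFactors = FALSE
)

df_new <- data.frame(
  bb = c("A", "A", "A", "B", "A", "A"),
  z  = 1:6,
  aa = c(NA, NA, 123, 1234, NA, 12),
  stringsAsFactors = FALSE
)

# Row of df_new that corresponds to each row of df_old (NA if no match)
idx <- match(df_old$z, df_new$z)
has_match <- !is.na(idx)

df_updated <- df_old

# bb: take the new value wherever a matching row exists
df_updated$bb[has_match] <- df_new$bb[idx[has_match]]

# aa: fill from df_new only where the old value is missing
fill <- has_match & is.na(df_old$aa)
df_updated$aa[fill] <- df_new$aa[idx[fill]]

df_updated
```

Unlike merge(), this keeps the row and column order of df_old and leaves rows without a match in df_new unchanged rather than dropping them.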
