
Overwrite values of existing dataframe

I have a data frame which holds two types of observations, coded by IDs ( id.1 , id.2 ) with corresponding values ( val.1 , val.2 ), plus several other columns represented in this example by val.other .

set.seed(1)
# df.master
id.1= c("abc", "def", "ghi", "jkl")
val.1= c(1, 2, 3, 4)
id.2= c("mno", "pqr", "stu", "vwx")
val.2= c(5, 6, 7, 8)
val.other= rep(runif(1),4)
df.master= data.frame(id.1, id.2, val.other, val.1, val.2)

df.master looks like:

  id.1 id.2 val.other val.1 val.2
1  abc  mno 0.2655087     1     5
2  def  pqr 0.2655087     2     6
3  ghi  stu 0.2655087     3     7
4  jkl  vwx 0.2655087     4     8

I generate new data stored separately in a 2nd and 3rd data frame df.new.1 and df.new.2 .

df.new.1 looks like:

  id.3 val.3
1  abc    10
2  ghi    20
3  stu    30

# Create a 2nd data frame, which contains new values
id.3= c("abc", "ghi", "stu")
val.3= c(10, 20, 30)
df.new.1= data.frame(id.3, val.3)

df.new.2 looks like:

  id.4 val.4
1  def   100
2  vwx   200

# Create a 3rd data frame, which contains new values
id.4= c("def", "vwx")
val.4= c(100, 200)
df.new.2= data.frame(id.4, val.4)

I want to update df.master based on the contents of df.new.1 and df.new.2 while keeping the original structure of df.master , leading to the following result:

  id.1 id.2 val.other val.1 val.2
1  abc  mno 0.2655087    10     5
2  def  pqr 0.2655087   100     6
3  ghi  stu 0.2655087    20    30
4  jkl  vwx 0.2655087     4   200

Please note that df.new.1 and df.new.2 contain a mix of new data matching id.1 and id.2 of df.master .

Any suggestions for code to perform the update of df.master ?

Something like the following could be helpful:

ids_mat = as.matrix(df.master[c("id.1", "id.2")])
mat_inds = arrayInd(match(df.new.1$id.3, ids_mat), dim(ids_mat))
df.master[c("val.1", "val.2")][mat_inds] <- df.new.1$val.3
df.master
#  id.1 id.2 val.other val.1 val.2
#1  abc  mno 0.2655087    10     5
#2  def  pqr 0.2655087     2     6
#3  ghi  stu 0.2655087    20    30
#4  jkl  vwx 0.2655087     4     8

Same logic for df.new.2 .
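To make "same logic" concrete, here is a minimal sketch that wraps the matrix-index approach in a function and applies it to both new data frames. The name update_master and its arguments are hypothetical, not part of any package, and the guard for IDs missing from df.master is an addition that the original snippet does not have. The example data is rebuilt so the block runs on its own.

```r
# Rebuild the example data from the question
# (0.2655087 is the displayed value of runif(1) after set.seed(1))
df.master <- data.frame(
  id.1 = c("abc", "def", "ghi", "jkl"),
  id.2 = c("mno", "pqr", "stu", "vwx"),
  val.other = 0.2655087,
  val.1 = c(1, 2, 3, 4),
  val.2 = c(5, 6, 7, 8),
  stringsAsFactors = FALSE
)
df.new.1 <- data.frame(id.3 = c("abc", "ghi", "stu"), val.3 = c(10, 20, 30))
df.new.2 <- data.frame(id.4 = c("def", "vwx"), val.4 = c(100, 200))

# update_master() is a hypothetical helper for illustration only
update_master <- function(master, new_ids, new_vals,
                          id_cols = c("id.1", "id.2"),
                          val_cols = c("val.1", "val.2")) {
  ids_mat <- as.matrix(master[id_cols])
  # Linear position of the FIRST occurrence of each new ID in the ID matrix
  pos <- match(as.character(new_ids), ids_mat)
  found <- !is.na(pos)                 # assumption: unknown IDs are skipped
  if (!any(found)) return(master)
  # Convert linear positions to (row, column) pairs and assign the values
  mat_inds <- arrayInd(pos[found], dim(ids_mat))
  master[val_cols][mat_inds] <- new_vals[found]
  master
}

df.master <- update_master(df.master, df.new.1$id.3, df.new.1$val.3)
df.master <- update_master(df.master, df.new.2$id.4, df.new.2$val.4)
df.master
```

Because match() returns only the first occurrence, an ID that appears in both id.1 and id.2 would be updated in one place only with this approach.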

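The following loop is not perfect: if the same ID appears in both of the first two columns of df.master , both val.1 and val.2 are changed, and an ID that occurs more than once within a single column is skipped, because an update only happens on exactly one match.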

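for (i in seq_len(nrow(df.new.1))) {
   id <- as.character(df.new.1[i, 1])
   # exact comparison instead of grep(), which would also hit partial (regex) matches
   tmp <- which(as.character(df.master[, 1]) == id)
   if (length(tmp) == 1) {  # exactly one match in id.1: update val.1
       df.master[tmp, 4] <- df.new.1[i, 2]
   }
   tmp <- which(as.character(df.master[, 2]) == id)
   if (length(tmp) == 1) {  # exactly one match in id.2: update val.2
       df.master[tmp, 5] <- df.new.1[i, 2]
   }
}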
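As a possible alternative to looping over rows, the sketch below stacks both new data frames and uses one match() call per ID column. This is a different technique from the loop: it refers to columns by name rather than position, and it updates every row whose ID has a new value, including IDs that occur more than once. The names new_all, id and val are assumptions introduced for this example.

```r
# Rebuild the example data from the question
df.master <- data.frame(
  id.1 = c("abc", "def", "ghi", "jkl"),
  id.2 = c("mno", "pqr", "stu", "vwx"),
  val.other = 0.2655087,
  val.1 = c(1, 2, 3, 4),
  val.2 = c(5, 6, 7, 8),
  stringsAsFactors = FALSE
)
df.new.1 <- data.frame(id.3 = c("abc", "ghi", "stu"), val.3 = c(10, 20, 30))
df.new.2 <- data.frame(id.4 = c("def", "vwx"), val.4 = c(100, 200))

# Stack both update tables under common (hypothetical) column names
new_all <- rbind(
  setNames(df.new.1, c("id", "val")),
  setNames(df.new.2, c("id", "val"))
)
new_all$id <- as.character(new_all$id)

id_cols  <- c("id.1", "id.2")
val_cols <- c("val.1", "val.2")

for (k in seq_along(id_cols)) {
  # For each master row: position of its ID in new_all, NA if there is no new value
  hit  <- match(as.character(df.master[[id_cols[k]]]), new_all$id)
  rows <- !is.na(hit)
  df.master[[val_cols[k]]][rows] <- new_all$val[hit[rows]]
}
df.master
```

If an ID is listed more than once in the stacked new data, match() picks the first entry, so the order of the rbind() arguments decides which value wins.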
