Replace values in one column after matching values in another column

Question

My data looks something like this. What I want to do now is replace the "Old ID" values using the matching values from a second table. The first table is this:

      Old ID |   Usage 
       211         25          
       211         17          
       211         18         
       202         11          
       202         12          
       194         17          
       202         16          
       194         22          
       194         84          
       198         26

The second table, with the matching values:

      Old ID |     ID 
       211         abf          
       202         rdg          
       194         ufe         
       198

The first table should change once each value in Old ID has been replaced with the corresponding ID from the second table. If the value in the ID column is missing or "NULL", the replaced value in the first table should show as "N/A". The first table should then look like this:

      Old ID |   Usage 
       abf         25          
       abf         17          
       abf         18         
       rdg         11          
       rdg         12          
       ufe         17          
       rdg         16          
       ufe         22          
       ufe         84          
       N/A         26

I have around 2 million such entries. Thanks a lot for your help.

Answer 1

Something like this?

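# Example data: df1 holds the usage records, df2 maps each old ID to its new ID
df1 <- data.frame(old.id = c(211, 211, 211, 202, 194, 202, 198, 194), usage = 20:27, stringsAsFactors = FALSE)
df2 <- data.frame(old.id = c(211, 211, 212, 213, 202, 198), ID = c("a", "a", "b", "c", "d", "e"), stringsAsFactors = FALSE)

# For each old ID, take the first matching ID from df2, or NA when there is no match
df1$old.id <- sapply(df1$old.id, function(nm) {
  out <- df2$ID[df2$old.id == nm]
  if (length(out) > 0) out[1] else NA_character_
})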

df1
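
With around 2 million rows, the per-row lookup above may be slow. A vectorised variant using base R's match() is sketched below. It is only a sketch: the data frames are rebuilt from the tables in the question, the names df1, df2, old.id and usage are illustrative, and it assumes each old ID appears at most once in the lookup table (match() uses the first hit otherwise).

```r
# Example data mirroring the question; names are illustrative assumptions.
df1 <- data.frame(old.id = c(211, 211, 211, 202, 202, 194, 202, 194, 194, 198),
                  usage  = c(25, 17, 18, 11, 12, 17, 16, 22, 84, 26))
df2 <- data.frame(old.id = c(211, 202, 194, 198),
                  ID     = c("abf", "rdg", "ufe", NA),
                  stringsAsFactors = FALSE)

# match() gives, for every df1 row, the position of the first equal key in df2
# (NA when the key is absent), so a single index replaces the row-by-row search.
new.id <- df2$ID[match(df1$old.id, df2$old.id)]

# Treat a missing ID and the literal string "NULL" alike, as the question asks.
new.id[is.na(new.id) | new.id == "NULL"] <- "N/A"

df1$old.id <- new.id
df1
```

Because the whole column is computed in one indexing step, the order of df1 is preserved and no row is dropped or duplicated.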

Answer 2

First merge the two tables, then remove the duplicates as below:

  S <- merge(df1, df2, by = "Old_ID")
  S[!duplicated(S), c(3, 2)]
      ID Usage
 1   ufe    17
 4   ufe    22
 7   ufe    84
 10 <NA>    26
 11  rdg    11
 14  rdg    12
 17  rdg    16
 20  abf    25
 23  abf    17
 26  abf    18
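
Note that merge() sorts the result by the key, and dropping duplicated rows would also drop genuine repeats of the same ID and usage. A possible variant that keeps the original row order and writes "N/A" for missing IDs is sketched here. The data frames are rebuilt from the question's tables, and the helper column row and the names lookup and result are hypothetical, introduced only for this illustration.

```r
# Example data mirroring the question; names are illustrative assumptions.
df1 <- data.frame(Old_ID = c(211, 211, 211, 202, 202, 194, 202, 194, 194, 198),
                  Usage  = c(25, 17, 18, 11, 12, 17, 16, 22, 84, 26))
df2 <- data.frame(Old_ID = c(211, 202, 194, 198),
                  ID     = c("abf", "rdg", "ufe", NA),
                  stringsAsFactors = FALSE)

# Drop repeated keys from the lookup table first, so the join cannot multiply rows.
lookup <- df2[!duplicated(df2$Old_ID), ]

# Remember the original row order, because merge() sorts its result by the key.
df1$row <- seq_len(nrow(df1))

# all.x = TRUE keeps every row of df1, even when its key is absent from the lookup.
S <- merge(df1, lookup, by = "Old_ID", all.x = TRUE)
S <- S[order(S$row), ]

# Missing IDs and the literal string "NULL" both become "N/A".
S$ID[is.na(S$ID) | S$ID == "NULL"] <- "N/A"

result <- data.frame(ID = S$ID, Usage = S$Usage, stringsAsFactors = FALSE)
result
```

Deduplicating the lookup table rather than the merged result means repeated usage records in the first table survive the join.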

Answer 3

This can be solved with an update on join:

library(data.table)
setDT(DT1)[setDT(DT2), on = "Old_ID", Old_ID := ID][]

    Old_ID Usage
 1:    abf    25
 2:    abf    17
 3:    abf    18
 4:    rdg    11
 5:    rdg    12
 6:    ufe    17
 7:    rdg    16
 8:    ufe    22
 9:    ufe    84
10:     NA    26

Data

# DT1 before the update: Old_ID holds the original keys from the question's first table
DT1 <- structure(list(Old_ID = c("211", "211", "211", "202", "202", 
"194", "202", "194", "194", "198"), Usage = c("25", "17", "18", 
"11", "12", "17", "16", "22", "84", "26")), .Names = c("Old_ID", 
"Usage"), row.names = c(NA, -10L), class = c("data.table", "data.frame"))

DT2 <- structure(list(Old_ID = c("211", "202", "194", "198"), ID = c("abf", 
"rdg", "ufe", NA)), .Names = c("Old_ID", "ID"), row.names = c(NA, 
-4L), class = c("data.table", "data.frame"))
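
The update on join leaves NA where the second table has no ID, while the question asks for "N/A". One way to add that step is sketched below, assuming the data.table package is installed. The tables are rebuilt with data.table() from the question's values rather than taken from the structure() calls above, and Usage is stored as numeric purely for illustration.

```r
library(data.table)

# Example data mirroring the question; keys are stored as character in both tables.
DT1 <- data.table(Old_ID = c("211", "211", "211", "202", "202",
                             "194", "202", "194", "194", "198"),
                  Usage  = c(25, 17, 18, 11, 12, 17, 16, 22, 84, 26))
DT2 <- data.table(Old_ID = c("211", "202", "194", "198"),
                  ID     = c("abf", "rdg", "ufe", NA))

# Update join: every DT1 row whose key is found in DT2 gets DT2's ID, by reference.
# The i. prefix refers to the column of the table inside the brackets (DT2).
DT1[DT2, on = "Old_ID", Old_ID := i.ID]

# Missing IDs and the literal string "NULL" both become "N/A".
DT1[is.na(Old_ID) | Old_ID == "NULL", Old_ID := "N/A"]
DT1[]
```

One caveat of this sketch: a key in DT1 that does not appear in DT2 at all is left untouched by the update join, so it would keep its old value rather than become "N/A". Because both steps modify DT1 in place, no copy of the 2 million rows is made.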

Answer 1
0 2017-08-29 23:51:57

Answer 2
0 2017-08-30 00:41:16

Answer 3
0 Accepted 2017-08-30 08:59:28
