简体   繁体   中英

Match and replace value using 2 Data Frames (R)

2 dfs, need to match "Name" with info$Name and replace corresponding values in details$Salary , df - details should retain all values and there should be no NAs(if match found replace the value if not found leave as it is)

details<- data.frame(Name = c("Aks","Bob","Caty","David","Enya","Fredrick","Gaby","Hema","Isac","Jaby","Katy"),
                     Age = c(12,22,33,43,24,67,41,19,25,24,32),
                     Gender = c("f","m","m","f","m","f","m","f","m","m","m"),
                     Salary = c(1500,2000,3.6,8500,1.2,1400,2300,2.5,5.2,2000,1265))

info <- data.frame(Name = c("caty","Enya","Dadi","Enta","Billu","Viku","situ","Hema","Ignu","Isac"),
                income = c(2500,5600,3200,1522,2421,3121,4122,5211,1000,3500))   

Expected Result :

Name      Age Gender Salary
Aks       12      f   1500
Bob       22      m   2000
Caty      33      m   2500
David     43      f   8500
Enya      24      m   5600
Fredrick  67      f   1400
Gaby      41      m   2300
Hema      19      f   5211
Isac      25      m   3500
Jaby      24      m   2000
Katy      32      m   1265     

None of the following is giving expected result

dplyr::left_join(details,info,by = "Name") 
dplyr::right_join(details,info,by = "Name") 
dplyr::inner_join(details,info, by ="Name") # for other matching and replace this works fine but not here
dplyr:: full_join(details,info,by ="Name")

All the results are giving NA's , tried using match function also but it is not giving desired result, any help would be highly appreciated

You have Name in both the dataframe in different cases, we need to first bring them in the same case then do a left_join with them and use coalesce to select the first non-NA value between income and salary .

library(dplyr)

details %>% mutate(Name = stringr::str_to_title(Name)) %>%
  left_join(info %>% mutate(Name = stringr::str_to_title(Name)), by = "Name") %>%
  mutate(Salary = coalesce(income, Salary)) %>%
  select(names(details))

#       Name Age Gender Salary
#1       Aks  12      f   1500
#2       Bob  22      m   2000
#3      Caty  33      m   2500
#4     David  43      f   8500
#5      Enya  24      m   5600
#6  Fredrick  67      f   1400
#7      Gaby  41      m   2300
#8      Hema  19      f   5211
#9      Isac  25      m   3500
#10     Jaby  24      m   2000
#11     Katy  32      m   1265

A base R solution:


matches <- match(tolower(details$Name), tolower(info$Name))
match <-  !is.na(matches)

details$Salary[match] <- info$income[matches[match]]

#Result
Name Age Gender Salary
1       Aks  12      f   1500
2       Bob  22      m   2000
3      Caty  33      m   2500
4     David  43      f   8500
5      Enya  24      m   5600
6  Fredrick  67      f   1400
7      Gaby  41      m   2300
8      Hema  19      f   5211
9      Isac  25      m   3500
10     Jaby  24      m   2000
11     Katy  32      m   1265

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM