Match and replace values using two data frames (R)
I have two data frames. I need to match details$Name against info$Name and replace the corresponding values in details$Salary with info$income. The details data frame should retain all of its rows and contain no NAs: if a match is found, replace the value; if not, leave it as it is.
details<- data.frame(Name = c("Aks","Bob","Caty","David","Enya","Fredrick","Gaby","Hema","Isac","Jaby","Katy"),
Age = c(12,22,33,43,24,67,41,19,25,24,32),
Gender = c("f","m","m","f","m","f","m","f","m","m","m"),
Salary = c(1500,2000,3.6,8500,1.2,1400,2300,2.5,5.2,2000,1265))
info <- data.frame(Name = c("caty","Enya","Dadi","Enta","Billu","Viku","situ","Hema","Ignu","Isac"),
income = c(2500,5600,3200,1522,2421,3121,4122,5211,1000,3500))
Expected result:
Name Age Gender Salary
Aks 12 f 1500
Bob 22 m 2000
Caty 33 m 2500
David 43 f 8500
Enya 24 m 5600
Fredrick 67 f 1400
Gaby 41 m 2300
Hema 19 f 5211
Isac 25 m 3500
Jaby 24 m 2000
Katy 32 m 1265
None of the following gives the expected result:
dplyr::left_join(details,info,by = "Name")
dplyr::right_join(details,info,by = "Name")
dplyr::inner_join(details, info, by = "Name") # works fine for other match-and-replace cases, but not here
dplyr::full_join(details, info, by = "Name")
All of these produce NAs. I also tried the match function, but it did not give the desired result. Any help would be highly appreciated.
Name appears in both data frames, but in different letter cases ("Caty" in details, "caty" in info), so the joins find no match for those rows. First bring the names to the same case, then do a left_join and use coalesce to select the first non-NA value between income and Salary.
library(dplyr)
details %>% mutate(Name = stringr::str_to_title(Name)) %>%
left_join(info %>% mutate(Name = stringr::str_to_title(Name)), by = "Name") %>%
mutate(Salary = coalesce(income, Salary)) %>%
select(names(details))
# Name Age Gender Salary
#1 Aks 12 f 1500
#2 Bob 22 m 2000
#3 Caty 33 m 2500
#4 David 43 f 8500
#5 Enya 24 m 5600
#6 Fredrick 67 f 1400
#7 Gaby 41 m 2300
#8 Hema 19 f 5211
#9 Isac 25 m 3500
#10 Jaby 24 m 2000
#11 Katy 32 m 1265
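A variant of the same idea, as a minimal sketch: instead of rewriting Name with stringr::str_to_title, it joins on a lower-cased helper column, so the original spelling in details$Name is left untouched and stringr is not needed. The column name key and the objects info_key and result are hypothetical names introduced here for illustration; the sketch also assumes each name occurs at most once in info.

```r
library(dplyr)

# Data from the question
details <- data.frame(
  Name   = c("Aks","Bob","Caty","David","Enya","Fredrick","Gaby","Hema","Isac","Jaby","Katy"),
  Age    = c(12,22,33,43,24,67,41,19,25,24,32),
  Gender = c("f","m","m","f","m","f","m","f","m","m","m"),
  Salary = c(1500,2000,3.6,8500,1.2,1400,2300,2.5,5.2,2000,1265)
)
info <- data.frame(
  Name   = c("caty","Enya","Dadi","Enta","Billu","Viku","situ","Hema","Ignu","Isac"),
  income = c(2500,5600,3200,1522,2421,3121,4122,5211,1000,3500)
)

# `key` is a hypothetical helper column: a lower-cased copy of Name used only for joining
info_key <- info %>%
  mutate(key = tolower(Name)) %>%
  select(key, income)

result <- details %>%
  mutate(key = tolower(Name)) %>%
  left_join(info_key, by = "key") %>%          # unmatched rows get income = NA
  mutate(Salary = coalesce(income, Salary)) %>% # keep the old Salary where income is NA
  select(all_of(names(details)))                # drop the helper columns

result
```

If info could contain the same name more than once, the left_join would duplicate rows of details, so it may be worth de-duplicating info first.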
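A base R solution:
# Position of each details$Name in info$Name, ignoring case (NA where there is no match)
matches <- match(tolower(details$Name), tolower(info$Name))
found <- !is.na(matches) # named "found" to avoid shadowing the match() function
details$Salary[found] <- info$income[matches[found]]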
#Result
Name Age Gender Salary
1 Aks 12 f 1500
2 Bob 22 m 2000
3 Caty 33 m 2500
4 David 43 f 8500
5 Enya 24 m 5600
6 Fredrick 67 f 1400
7 Gaby 41 m 2300
8 Hema 19 f 5211
9 Isac 25 m 3500
10 Jaby 24 m 2000
11 Katy 32 m 1265
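The same base R lookup can also be written with a vectorised ifelse that builds a copy rather than modifying details in place. This is only a sketch under the question's data; updated and idx are hypothetical names. Note that match returns the first match only, so duplicate names in info would silently use the first income.

```r
# Data from the question
details <- data.frame(
  Name   = c("Aks","Bob","Caty","David","Enya","Fredrick","Gaby","Hema","Isac","Jaby","Katy"),
  Age    = c(12,22,33,43,24,67,41,19,25,24,32),
  Gender = c("f","m","m","f","m","f","m","f","m","m","m"),
  Salary = c(1500,2000,3.6,8500,1.2,1400,2300,2.5,5.2,2000,1265)
)
info <- data.frame(
  Name   = c("caty","Enya","Dadi","Enta","Billu","Viku","situ","Hema","Ignu","Isac"),
  income = c(2500,5600,3200,1522,2421,3121,4122,5211,1000,3500)
)

# Case-insensitive lookup: row of info for each row of details, NA if absent
idx <- match(tolower(details$Name), tolower(info$Name))

# Work on a copy so the original details stays available for comparison
updated <- details
updated$Salary <- ifelse(is.na(idx), details$Salary, info$income[idx])

# Sanity checks matching the requirements in the question
stopifnot(nrow(updated) == nrow(details))  # all rows retained
stopifnot(!anyNA(updated$Salary))          # no NAs introduced

updated
```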