Merge two data frames by a partial match in R
I want to merge two data frames by name; however, the names differ slightly between the two data frames. Is there a way to merge them by a partial match? I have tried the answers to other posts but have not gotten the results I need. Thanks
# Create data frames
df1 <- data.frame(
  "Attending" = c("Kokabi, Nima", "Tong, Frank Charles", "Devireddy, Chandan",
                  "Greenbaum, Adam B", "Amin, Dina"),
  "Outcome" = rep(1, times = 5), stringsAsFactors = FALSE)
df2 <- data.frame(
  "Credentialed" = c("Kokabi, Nima, MD", "Tong, Frank Charles, MD",
                     "Devireddy, Chandanreddy M, MD", "Greenbaum, Adam Brett, MD",
                     "Amin, Dina, DDS"),
  "Status" = rep("Active", times = 5), stringsAsFactors = FALSE)
# Desired result
final <- data.frame(
  "Attending" = c("Kokabi, Nima", "Tong, Frank Charles", "Devireddy, Chandan",
                  "Greenbaum, Adam B", "Amin, Dina"),
  "Outcome" = rep(1, times = 5),
  "Credentialed" = c("Kokabi, Nima, MD", "Tong, Frank Charles, MD",
                     "Devireddy, Chandanreddy M, MD", "Greenbaum, Adam Brett, MD",
                     "Amin, Dina, DDS"),
  "Status" = rep("Active", times = 5)
)
head(final)
Here is a possible solution using grep.
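library(dplyr)  # provides left_join()
# grep() returns the matching df2 names in df2's order, so this assumes both data frames list the same people in the same order
df1$Credentialed <- grep(paste(df1$Attending, collapse = '|'), df2$Credentialed, value = TRUE)
left_join(df1, df2)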
Joining, by = "Credentialed"
            Attending Outcome                  Credentialed Status
1        Kokabi, Nima       1              Kokabi, Nima, MD Active
2 Tong, Frank Charles       1       Tong, Frank Charles, MD Active
3  Devireddy, Chandan       1 Devireddy, Chandanreddy M, MD Active
4   Greenbaum, Adam B       1     Greenbaum, Adam Brett, MD Active
5          Amin, Dina       1               Amin, Dina, DDS Active
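One caveat with the grep call above: it returns the matching df2 names in df2's order, so the new column lines up with df1 only when both data frames are sorted the same way. Below is a minimal sketch of a row-by-row alternative in base R. It assumes each Attending name is a literal prefix of its Credentialed name, which holds for the sample data but is not guaranteed in general. The variable idx and the shuffling of df2 are only there for illustration.

```r
# Sample data copied from the question
df1 <- data.frame(
  Attending = c("Kokabi, Nima", "Tong, Frank Charles", "Devireddy, Chandan",
                "Greenbaum, Adam B", "Amin, Dina"),
  Outcome = rep(1, times = 5), stringsAsFactors = FALSE)
df2 <- data.frame(
  Credentialed = c("Kokabi, Nima, MD", "Tong, Frank Charles, MD",
                   "Devireddy, Chandanreddy M, MD", "Greenbaum, Adam Brett, MD",
                   "Amin, Dina, DDS"),
  Status = rep("Active", times = 5), stringsAsFactors = FALSE)

# Shuffle df2 to show that the match does not rely on row order (illustration only)
df2 <- df2[c(5, 3, 1, 4, 2), ]

# For each Attending name, take the index of the first Credentialed name
# that starts with it; which(...)[1] yields NA when nothing matches
idx <- vapply(df1$Attending,
              function(a) which(startsWith(df2$Credentialed, a))[1],
              integer(1), USE.NAMES = FALSE)

# Bind the matched df2 rows next to df1, keeping df1's row order
final <- cbind(df1, df2[idx, , drop = FALSE])
rownames(final) <- NULL
final
```

Because startsWith compares literal text, characters such as "." or "(" in a name are not treated as regular-expression syntax. Names with no match end up with NA in the Credentialed and Status columns, which mirrors what a left join would give.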
Note: I would suggest setting stringsAsFactors = FALSE in your data.frame call. Also watch for names pasted across a line break: a line break inside a quoted string is read by R as a newline character, not as a space.
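To make the line-break point concrete, here is a small sketch of what happens when a quoted name is split across two lines, along with one possible way to clean it up. The variable names broken and cleaned are hypothetical.

```r
# A name pasted across two lines: the line break becomes part of the string
broken <- "Devireddy,
Chandan"

# The string now contains a newline character where a space was intended
grepl("\n", broken, fixed = TRUE)

# Collapse any run of whitespace (including the newline) into a single space
cleaned <- gsub("[[:space:]]+", " ", broken)
identical(cleaned, "Devireddy, Chandan")
```

Both checks should return TRUE. A name that still contains the newline will not match its counterpart in the other data frame, so it is worth cleaning such values before attempting the partial match.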
# Data frames with each name kept on a single line
df1 <- data.frame(
  "Attending" = c("Kokabi, Nima", "Tong, Frank Charles", "Devireddy, Chandan",
                  "Greenbaum, Adam B", "Amin, Dina"),
  "Outcome" = rep(1, times = 5), stringsAsFactors = FALSE)
df2 <- data.frame(
  "Credentialed" = c("Kokabi, Nima, MD", "Tong, Frank Charles, MD",
                     "Devireddy, Chandanreddy M, MD", "Greenbaum, Adam Brett, MD",
                     "Amin, Dina, DDS"),
  "Status" = rep("Active", times = 5), stringsAsFactors = FALSE)