For two example dataframes:
df1 <- structure(list(name = c("Katie", "Eve", "James", "Alexander",
"Mary", "Barrie", "Harry", "Sam"), postcode = c("CB12FR", "CB12FR",
"NE34TR", "DH34RL", "PE46YH", "IL57DS", "IP43WR", "IL45TR")), .Names = c("name",
"postcode"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-8L), spec = structure(list(cols = structure(list(name = structure(list(), class = c("collector_character",
"collector")), postcode = structure(list(), class = c("collector_character",
"collector"))), .Names = c("name", "postcode")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
df2 <-structure(list(name = c("Katie", "James", "Alexander", "Lucie",
"Mary", "Barrie", "Claire", "Harry", "Clare", "Hannah", "Rob",
"Eve", "Sarah"), postcode = c("CB12FR", "NE34TR", "DH34RL", "DL56TH",
"PE46YH", "IL57DS", "RE35TP", "IP43WQ", "BH35OP", "CB12FR", "DL56TH",
"CB12FR", "IL45TR"), rating = c(1L, 1L, 1L, 2L, 3L, 1L, 4L, 2L,
2L, 3L, 1L, 4L, 2L)), .Names = c("name", "postcode", "rating"
), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-13L), spec = structure(list(cols = structure(list(name = structure(list(), class = c("collector_character",
"collector")), postcode = structure(list(), class = c("collector_character",
"collector")), rating = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("name", "postcode", "rating")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
I want to add a column to df1 that gives the ratings from df2. There may be multiple ratings for each postcode (which is why a straight merge wouldn't work).
I only want to merge the two dataframes when the postcode AND the first 3 characters of the name are the same (provided these are unique in df1). For example, if there were a Katherine and a Katie, both with the same postcode, these wouldn't be merged.
I am happy to have blanks where there is no merge.
Any ideas?
Wouldn't a simple join on multiple columns solve your problem? Something like:
df <- merge(x = df1, y = df2, by = c("name", "postcode"), all.x = TRUE)
An alternative, in case the column names don't match, is to build a combined key in each dataframe and merge on that:
df1$key <- paste(df1$name, df1$postcode, sep = "_")
df2$key <- paste(df2$name, df2$postcode, sep = "_")
df <- merge(x = df1, y = df2, by = "key", all.x = TRUE)
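Both merges above match on the full name, so they don't cover the "first 3 characters" rule from the question. Here is a minimal base R sketch of one way to handle that, not a definitive solution. The small df1/df2 below are cut-down stand-ins for the question's data, and the `key` and `dup` names are only for illustration. It assumes that a key occurring more than once in df1 should get no rating at all, which is how I read "providing these are unique in df1".

```r
# Cut-down stand-ins for the question's df1/df2 (same columns, fewer rows)
df1 <- data.frame(
  name     = c("Katie", "Eve", "James", "Harry", "Sam"),
  postcode = c("CB12FR", "CB12FR", "NE34TR", "IP43WR", "IL45TR"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  name     = c("Katie", "James", "Harry", "Hannah", "Eve", "Sarah"),
  postcode = c("CB12FR", "NE34TR", "IP43WQ", "CB12FR", "CB12FR", "IL45TR"),
  rating   = c(1L, 1L, 2L, 3L, 4L, 2L),
  stringsAsFactors = FALSE
)

# Key = first three characters of the name plus the postcode
df1$key <- paste(substr(df1$name, 1, 3), df1$postcode, sep = "_")
df2$key <- paste(substr(df2$name, 1, 3), df2$postcode, sep = "_")

# Assumption: keys that occur more than once in df1 are ambiguous,
# so blank them out and they will not match anything in df2
dup <- df1$key %in% df1$key[duplicated(df1$key)]
df1$key[dup] <- NA

# match() gives the position of the first matching key in df2, or NA if none
df1$rating <- df2$rating[match(df1$key, df2$key)]

# Drop the helper column
df1$key <- NULL
```

On this sample, Harry and Sam end up with NA ratings (Harry's postcode differs, and "Sam" vs "Sar" don't share three characters). Note that match() only takes the first matching row in df2, so if df2 can hold several ratings for the same key you would need to decide how to combine them first, or go back to merge() and accept multiple rows.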