Inner join on two data frames based on an exact match for one column and a fuzzy match for two columns
I'd like to perform an exact match on one of my columns (Product_date), followed by a partial or fuzzy match on Product_name and Product_state.
For example:
df1 <- data.frame(ID=c("P01", "P04", "P23"),
                  Product_name=c("Jewel", "Bronze", "Iron"),
                  Product_state=c("Kansas", "Illinois", "Florida"),
                  Product_date=c("2021-08-01", "2021-01-01", "2020-12-21"))

df2 <- data.frame(
  Product_name=c("Jewel", "Bro", "Ir", "Uknw"),
  Product_state=c("Kansasss", "IllI", "Flor_ida", "Cali2"),
  Product_date=c("2021-08-01", "2021-01-01", "2020-12-21", "2020-09"),
  Product_status=c("sold", "lost", "sold", "sold"))

desired_df <- data.frame(ID=c("P01", "P04", "P23"),
                         Product_name=c("Jewel", "Bronze", "Iron"),
                         Product_state=c("Kansas", "Illinois", "Florida"),
                         Product_date=c("2021-08-01", "2021-01-01", "2020-12-21"),
                         Product_name=c("Jewel", "Bro", "Ir"),
                         Product_state=c("Kansasss", "IllI", "Flor_ida"),
                         Product_date=c("2021-08-01", "2021-01-01", "2020-12-21"),
                         Product_status=c("sold", "lost", "sold"))
Just for illustrative purposes, this is what the code in my head looks like (but of course it doesn't work):
matched <- df1 %>%
  stringdist_inner_join(df2, by= c("Product_name", max_dist=2),
                        by= c("Product_state", max_dist=4),
                        by = c("Product_date"))
A possible solution:
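library(fuzzyjoin)
library(dplyr)

# Fuzzy-match on Product_name and Product_state using Jaro-Winkler distance,
# then keep only the rows whose dates match exactly.
stringdist_join(df1, df2,
                by = c("Product_name", "Product_state"),
                mode = "left",
                ignore_case = FALSE,
                method = "jw",
                max_dist = 0.5) %>%
  # Unmatched rows from the left join have NA dates and are dropped here,
  # so the result behaves like an inner join.
  filter(Product_date.x == Product_date.y)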
#> ID Product_name.x Product_state.x Product_date.x Product_name.y
#> 1 P01 Jewel Kansas 2021-08-01 Jewel
#> 2 P04 Bronze Illinois 2021-01-01 Bro
#> 3 P23 Iron Florida 2020-12-21 Ir
#> Product_state.y Product_date.y Product_status
#> 1 Kansasss 2021-08-01 sold
#> 2 IllI 2021-01-01 lost
#> 3 Flor_ida 2020-12-21 sold
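
If you want a separate threshold per column, as in the pseudocode from the question, one option is fuzzyjoin's general-purpose fuzzy_inner_join(), which accepts a list of match functions, one per column in by. The sketch below is only an illustration under a few assumptions: it uses Levenshtein distance from the stringdist package, and the thresholds 3 and 5 are hypothetical values chosen to fit this sample data ("Bronze" vs "Bro" is 3 edits and "Illinois" vs "IllI" is 5, so the 2 and 4 from the question would drop P04). The helper lv_within() is not part of any package.

```r
library(fuzzyjoin)
library(stringdist)

df1 <- data.frame(ID = c("P01", "P04", "P23"),
                  Product_name = c("Jewel", "Bronze", "Iron"),
                  Product_state = c("Kansas", "Illinois", "Florida"),
                  Product_date = c("2021-08-01", "2021-01-01", "2020-12-21"))

df2 <- data.frame(Product_name = c("Jewel", "Bro", "Ir", "Uknw"),
                  Product_state = c("Kansasss", "IllI", "Flor_ida", "Cali2"),
                  Product_date = c("2021-08-01", "2021-01-01", "2020-12-21", "2020-09"),
                  Product_status = c("sold", "lost", "sold", "sold"))

# Hypothetical helper: builds a vectorized match function that returns TRUE
# when the Levenshtein distance between two strings is at most max_dist.
lv_within <- function(max_dist) {
  function(x, y) stringdist(x, y, method = "lv") <= max_dist
}

# One match function per column in `by`, in the same order:
# fuzzy on name (<= 3 edits), fuzzy on state (<= 5 edits), exact on date.
# The thresholds are assumptions tuned to this sample, not recommendations.
matched <- fuzzy_inner_join(
  df1, df2,
  by = c("Product_name", "Product_state", "Product_date"),
  match_fun = list(lv_within(3), lv_within(5), `==`)
)

matched
```

Because the date comparison is part of the join itself, no filter() step is needed afterwards, and a row is kept only when all three conditions hold.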