How to apply a fuzzy lookup to R data frame headers
I have two data frames. The first, df1, holds sales data from an OLAP cube and has unstructured headers. The second, df2, maps raw header names to report display names.
df1 <- data.frame("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]"= c("FY18","FY19","FY20"), "[Measures].[USD]"=c(100,200,300))
names(df1) <- c("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]","[Measures].[USD]")
df2<- data.frame("RawHeaderName"=c("[Time].[Fiscal Year]","[Measures].[USD]"),"ReportDisplayName"=c("FiscalYear","USD"))
My requirement: when a df2$RawHeaderName value matches (fuzzy matches) a df1 header, I need to replace that df1 header with the corresponding df2$ReportDisplayName value. The final output should look like this:
FinalOutput <- data.frame("FiscalYear" =c("FY18","FY19","FY20"),"USD"=c(100,200,300))
Please help me solve this problem. I have already tried the fuzzyjoin and dplyr packages (library("fuzzyjoin"), library("dplyr")), but with no luck.
I think you're simply looking for names(df1) <- c('Fiscal Year', 'USD'), which modifies df1 to:
  Fiscal Year USD
1        FY18 100
2        FY19 200
3        FY20 300
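If the new names should come from df2 rather than being typed by hand, one option is a prefix match: in the question's data, each RawHeaderName is a leading part of the corresponding df1 header. The following is only a sketch under that assumption, using base R's startsWith; the helper variables idx and matched are illustrative names, not part of the original post.

```r
# Data from the question (names are set separately so the brackets survive)
df1 <- data.frame(c("FY18", "FY19", "FY20"), c(100, 200, 300))
names(df1) <- c("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]",
                "[Measures].[USD]")
df2 <- data.frame(
  RawHeaderName     = c("[Time].[Fiscal Year]", "[Measures].[USD]"),
  ReportDisplayName = c("FiscalYear", "USD"),
  stringsAsFactors  = FALSE
)

# For each df1 header, find the first df2 row whose RawHeaderName is a prefix
# of that header (NA when no row qualifies)
idx <- vapply(
  names(df1),
  function(h) which(startsWith(h, df2$RawHeaderName))[1],
  integer(1)
)

# Rename only the headers that found a match; leave the rest untouched
matched <- !is.na(idx)
names(df1)[matched] <- df2$ReportDisplayName[idx[matched]]
```

Headers without a matching prefix keep their original names, so unmapped columns are easy to spot afterwards.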
After spending some time on this, the code below solves about half of the problem: it works only when the header matches exactly. I still need to explore fuzzy matching.
library("dplyr")
df1 <- data.frame("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]"= c("FY18","FY19","FY20"), "[Measures].[USD]"=c(100,200,300))
names(df1) <- c("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]","[Measures].[USD]")
df2<- data.frame("RawHeaderName"=c("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]","[Measures].[USD]"),"ReportDisplayName"=c("FiscalYear","USD"))
Extract_Headers <- (names(df1))
Extract_Headers <- data.frame("Headers"=as.character(Extract_Headers))
df2$RawHeaderName <- as.character(df2$RawHeaderName)
df2$ReportDisplayName <- as.character(df2$ReportDisplayName)
Cleansed_Headers <- Extract_Headers %>% inner_join (df2, by =c("Headers"="RawHeaderName"))
names(df1) <- Cleansed_Headers$ReportDisplayName
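For the remaining fuzzy part, one possible approach is base R's adist with partial = TRUE, which gives the edit distance between each lookup string and its best-matching substring of each header. This is a rough sketch rather than a tested solution for real cube output: the tolerance max_dist and the helper variables d, best, best_dist and ok are assumptions introduced here, and the threshold would need tuning on real data. It uses the shortened RawHeaderName values from the original question.

```r
# Data from the question
df1 <- data.frame(c("FY18", "FY19", "FY20"), c(100, 200, 300))
names(df1) <- c("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]",
                "[Measures].[USD]")
df2 <- data.frame(
  RawHeaderName     = c("[Time].[Fiscal Year]", "[Measures].[USD]"),
  ReportDisplayName = c("FiscalYear", "USD"),
  stringsAsFactors  = FALSE
)

# Assumed tolerance: maximum number of edits accepted as a match
max_dist <- 2

# Distance matrix: one row per lookup string, one column per df1 header.
# partial = TRUE compares each lookup string against substrings of the header.
d <- adist(df2$RawHeaderName, names(df1), partial = TRUE)

# For every header, pick the closest lookup row and its distance
best      <- apply(d, 2, which.min)
best_dist <- apply(d, 2, min)

# Rename only headers whose best match is within the tolerance
ok <- best_dist <= max_dist
names(df1)[ok] <- df2$ReportDisplayName[best[ok]]
```

Choosing the closest row per header, instead of accepting any row under the threshold, avoids renaming a column twice when several lookup strings are similar.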