How to apply a fuzzy lookup in an R data frame

Question

I have two data frames. 1. df1 holds sales data with unstructured headers that come from an OLAP cube.

df1 <- data.frame("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]"= c("FY18","FY19","FY20"), "[Measures].[USD]"=c(100,200,300))
names(df1) <- c("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]","[Measures].[USD]")

2. df2 holds the list of unstructured headers and their corresponding cleansed headers.

df2<- data.frame("RawHeaderName"=c("[Time].[Fiscal Year]","[Measures].[USD]"),"ReportDisplayName"=c("FiscalYear","USD"))

My requirement: when a df2$RawHeaderName value matches (fuzzy matches) a df1 header, I need to replace that df1 header with the corresponding df2$ReportDisplayName value. The final output should look like this:

FinalOutput <- data.frame("FiscalYear" =c("FY18","FY19","FY20"),"USD"=c(100,200,300))

Please help me solve this problem. I have already tried the fuzzyjoin and dplyr libraries (library("fuzzyjoin"), library("dplyr")), but with no luck.

Answer 1

I think you're simply looking for names(df1) <- c('Fiscal Year', 'USD'), which modifies df1 to:

  Fiscal Year USD
1        FY18 100
2        FY19 200
3        FY20 300
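If the new names should come from df2 rather than being typed by hand, one possible sketch in base R is a literal substring lookup: a header is renamed when some df2$RawHeaderName value occurs inside it. The helper name rename_by_substring is made up for this illustration, and taking the first hit when several keys match is an assumption, not something the question specifies.

```r
# Data from the question; check.names = FALSE keeps the raw OLAP headers intact.
df1 <- data.frame(
  "[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]" = c("FY18", "FY19", "FY20"),
  "[Measures].[USD]" = c(100, 200, 300),
  check.names = FALSE
)
df2 <- data.frame(
  RawHeaderName = c("[Time].[Fiscal Year]", "[Measures].[USD]"),
  ReportDisplayName = c("FiscalYear", "USD"),
  stringsAsFactors = FALSE
)

# rename_by_substring is a hypothetical helper, not part of any package.
rename_by_substring <- function(headers, raw, display) {
  vapply(headers, function(h) {
    # fixed = TRUE treats "[", "]" and "." literally instead of as regex syntax.
    hit <- which(vapply(raw, function(p) grepl(p, h, fixed = TRUE), logical(1)))
    # Keep the original header when nothing matches; otherwise take the first hit.
    if (length(hit) == 0) h else display[hit[1]]
  }, character(1), USE.NAMES = FALSE)
}

names(df1) <- rename_by_substring(names(df1), df2$RawHeaderName, df2$ReportDisplayName)
print(names(df1))
```

Headers without any matching key are left unchanged, so the assignment never fails on a length mismatch.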

Answer 2

After spending some time on this, the code below solves only half of the problem: it works only when an exact match exists. I still need to explore fuzzy matching.

library("dplyr")

df1 <- data.frame("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]"= c("FY18","FY19","FY20"), "[Measures].[USD]"=c(100,200,300))
names(df1) <- c("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]","[Measures].[USD]")


df2<- data.frame("RawHeaderName"=c("[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]","[Measures].[USD]"),"ReportDisplayName"=c("FiscalYear","USD"))


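# Put the current headers of df1 into a one-column data frame so they can be joined.
Extract_Headers <- data.frame("Headers" = names(df1), stringsAsFactors = FALSE)
# Make sure the join keys are character, not factor.
df2$RawHeaderName <- as.character(df2$RawHeaderName)
df2$ReportDisplayName <- as.character(df2$ReportDisplayName)
# Exact match only: inner_join drops any header that has no entry in df2.
Cleansed_Headers <- Extract_Headers %>% inner_join(df2, by = c("Headers" = "RawHeaderName"))
# Use the full column name instead of relying on partial matching of `$`.
names(df1) <- Cleansed_Headers$ReportDisplayName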
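For the fuzzy part that is still open, here is a rough sketch using utils::adist() from base R with partial = TRUE, which measures the edit distance between each raw name and the closest substring of each header. The function name fuzzy_rename and the max_rel_dist threshold (20% of the raw name's length) are assumptions made for this example, not values taken from the question, and the threshold would need tuning on real headers.

```r
# Data from the original question, where df2 holds only a prefix of the first header.
df1 <- data.frame(
  "[Time].[Fiscal Year].[Fiscal Year].[MEMBER_CAPTION]" = c("FY18", "FY19", "FY20"),
  "[Measures].[USD]" = c(100, 200, 300),
  check.names = FALSE
)
df2 <- data.frame(
  RawHeaderName = c("[Time].[Fiscal Year]", "[Measures].[USD]"),
  ReportDisplayName = c("FiscalYear", "USD"),
  stringsAsFactors = FALSE
)

# fuzzy_rename and max_rel_dist are hypothetical names chosen for this sketch.
fuzzy_rename <- function(headers, raw, display, max_rel_dist = 0.2) {
  # d[i, j]: edit distance between raw[i] and the closest substring of headers[j].
  d <- utils::adist(raw, headers, partial = TRUE)
  # For each header (column), the index of the closest raw name.
  best <- apply(d, 2, which.min)
  best_dist <- d[cbind(best, seq_along(headers))]
  # Accept a match only if the distance is small relative to the raw name's length.
  ok <- best_dist <= max_rel_dist * nchar(raw[best])
  # Headers without an acceptable match keep their original name.
  unname(ifelse(ok, display[best], headers))
}

names(df1) <- fuzzy_rename(names(df1), df2$RawHeaderName, df2$ReportDisplayName)
print(names(df1))
```

Because the comparison is approximate, a key with a small typo (for instance a made-up "[Time].[Fiscal Yaer]") still falls within the tolerance, while a header that resembles none of the keys is left as it is.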

2 answers

Answer 1
0 2020-03-04 14:35:12

Answer 2
0 2020-03-05 05:32:22
