Complicated VLOOKUP on Excel
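I am trying to analyze how effective the synonyms I maintain in my online shop are. I have a list of about 5,000 synonyms and want to look them up, with the help of Excel, in a list of 1,000,000 queries. The problem is that each "synonym cell" may contain several synonyms separated by spaces. I want to find these synonyms in the list of query strings. Ultimately, whenever there is a match, I want to `VLOOKUP` the product attached to that synonym in a reference cell and tally the results to see how many sales I got with the help of synonyms.
I hope this explanation is not too complicated and that you can help me find each synonym in the search queries. If you have a better idea of how to do this more efficiently, even better. :-)
Here is some sample data that should explain what I am trying to do: https://docs.google.com/spreadsheets/d/1UASfryBJ6pQiAqVy8Z6dJ1klJkUzCu4UZZCunjAIDFg/edit?usp=sharing
Feel free to edit it. Thanks a lot! 内斯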
To look up the synonyms (found in column B of the "synonyms" sheet), you may have to restructure the "synonyms" sheet like this:
| synonym | product |
| -------- | -------|
| icecream | cake1 |
| sweets | cake1 |
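After that, you need to split the query column into multiple columns (one word per column) and look up each word in the restructured synonyms sheet.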
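The restructuring and per-word lookup described above could be sketched in Power Query M roughly as follows. This is only an illustration under assumptions: the inline sample tables, the column names `synonyms` and `word`, and the second product `cake2` are hypothetical, and the lookup is an exact match. It splits into rows rather than columns so that the number of words does not matter.

```powerquery
let
    // Hypothetical sample data; replace with your own tables
    Synonyms = #table(
        type table [product = text, synonyms = text],
        {{"cake1", "icecream sweets"}, {"cake2", "candy"}}
    ),
    Queries = #table(
        type table [query = text],
        {{"buy icecream"}, {"cheap candy online"}, {"shoes"}}
    ),

    // One row per synonym: split the space-separated cell into a list, then expand
    SynonymLists = Table.TransformColumns(Synonyms, {{"synonyms", each Text.Split(_, " ")}}),
    SynonymRows = Table.ExpandListColumn(SynonymLists, "synonyms"),
    SynonymTable = Table.RenameColumns(SynonymRows, {{"synonyms", "synonym"}}),

    // One row per query word, keeping the full query next to each word
    WithWords = Table.AddColumn(Queries, "word", each Text.Split([query], " ")),
    QueryWords = Table.ExpandListColumn(WithWords, "word"),

    // Exact-match lookup of each word in the restructured synonym table;
    // the inner join keeps only words that are synonyms
    Joined = Table.NestedJoin(QueryWords, {"word"}, SynonymTable, {"synonym"}, "match", JoinKind.Inner),
    Result = Table.ExpandTableColumn(Joined, "match", {"synonym", "product"})
in
    Result
```

With these sample rows, only the queries containing "icecream" and "candy" should survive the join; "shoes" has no synonym and is dropped.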
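With this much data, and using Excel, I would suggest Power Query (available in Windows Excel 2010+ and O365). With PQ you can do a fuzzy match, which allows for things like `IgnoreCase` and setting an appropriate `Threshold` to allow for singular and plural words as well as minor misspellings in the queries. You may need to adjust things like the column splitters and the threshold depending on your real data. Also, I only allowed for one synonym and product. The column splitters could be rewritten to handle any number of columns; I will look at that later today if I have time.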
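As one possible way to lift the fixed three-word limit of the column splitter, the query text can be split into rows instead of columns. The sketch below is an assumption-laden illustration, not the answer's own code: the inline `Source` table is a hypothetical stand-in for `tblQuery`, and the step names `asLists` and `asRows` are made up. It produces the same columns as the `queryTbl` step in the M code (`bought`, `Index`, `Value`), though in a different order.

```powerquery
let
    // Hypothetical stand-in for the tblQuery table (columns "query" and "bought")
    Source = #table(
        type table [query = text, bought = text],
        {{"best icecream cake recipe ever", "cake1"}, {"sweets", "cake1"}}
    ),
    #"Add Index" = Table.AddIndexColumn(Source, "Index", 0, 1),

    // Turn each query into a list of words, however many there are
    asLists = Table.TransformColumns(#"Add Index", {{"query", each Text.Split(_, " ")}}),
    // One row per word; Index and bought are repeated on every row
    asRows = Table.ExpandListColumn(asLists, "query"),
    // Rename so later steps that expect a "Value" column keep working
    queryTbl = Table.RenameColumns(asRows, {{"query", "Value"}})
in
    queryTbl
```

Note that `Text.Split` returns empty strings for consecutive spaces, so real data may need trimming or filtering before the join.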
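I have tried to comment the M code to explain things.
Please note the table names in the `Source` and `Source2` steps of the code (`tblQuery` and `tblSyno`). You may need to change these (or change those lines entirely if you are reading the data from an external source).
M Code
Paste into the Advanced Editor in PQ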
let
//Read in Query Table and convert to single column of words in query
Source = Excel.CurrentWorkbook(){[Name="tblQuery"]}[Content],
#"Changed Type1" = Table.TransformColumnTypes(Source,{{"query", type text}, {"bought", type text}}),
//add an index column for eventual reconstruction
#"Add Index" = Table.AddIndexColumn(#"Changed Type1","Index",0,1),
//may need more splits depending on real data
splitIt = Table.SplitColumn(#"Add Index", "query", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv),
{"query.1", "query.2", "query.3"}),
#"Unpivoted Other Columns1" = Table.UnpivotOtherColumns(splitIt, {"Index", "bought"}, "Attribute", "Value"),
queryTbl = Table.RemoveColumns(#"Unpivoted Other Columns1",{"Attribute"}),
//Read in synonym table
//unpivot to convert to two column table
Source2 = Excel.CurrentWorkbook(){[Name="tblSyno"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source2,{{"product", type text}, {"query-synonym", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "query-synonym", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"query-synonym.1", "query-synonym.2", "query-synonym.3"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"query-synonym.1", type text}, {"query-synonym.2", type text}, {"query-synonym.3", type text}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type2", {"product"}, "Attribute", "Value"),
synoTbl = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
//combine tables based on synonyms
//combTbl = Table.NestedJoin(queryTbl,"Value",synoTbl,"Value","Joined",JoinKind.LeftOuter),
combTbl = Table.FuzzyNestedJoin(queryTbl,"Value",synoTbl,"Value","Joined",JoinKind.LeftOuter,
[IgnoreCase=true, Threshold=0.9]),
//extract the synonym
#"Added Custom" = Table.AddColumn(combTbl, "Synonym", each try Table.Column([Joined],"Value"){0}
otherwise null),
#"Added Custom4" = Table.AddColumn(#"Added Custom", "Attached Product", each try Table.Column([Joined],"product"){0}
otherwise null),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom4",{"Joined"}),
//Recombine by the Index column to recreate the query
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Index"}, {{"Group", each _, type table [bought=nullable text, Index=number, Value=text, Synonym=nullable text, #"Attached Product"=nullable text]}}),
#"Removed Columns1" = Table.RemoveColumns(#"Grouped Rows",{"Index"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns1", "query", each Table.Column([Group],"Value")),
#"Extracted Values" = Table.TransformColumns(#"Added Custom1",
{"query", each Text.Combine(List.Transform(_, Text.From), " "), type text}),
//extract the "bought" column from the group table
//if there might be more than one product in
//the "bought" column, need to change this
#"Added Custom2" = Table.AddColumn(#"Extracted Values", "bought", each
List.Distinct(Table.Column([Group],"bought")){0}),
//extract the Matched Synonym column (null when no word in the query matched)
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "Matched Synonym", each List.RemoveNulls(Table.Column([Group],"Synonym")){0}?),
//extract the Attached Product column (null when no word in the query matched)
#"Added Custom5" = Table.AddColumn(#"Added Custom3", "Attached Product", each List.RemoveNulls(Table.Column([Group],"Attached Product")){0}?),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom5",{"Group"})
in
#"Removed Columns2"
Synonym Table
Results
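To get from the query output to the sales count the question asks for, one option is to filter and group the result. The sketch below rests on an assumption that is not confirmed by the original post: a sale is credited to a synonym when the `bought` product equals the `Attached Product`. The inline `Result` table and its rows are hypothetical sample data shaped like the final output of the M code above.

```powerquery
let
    // Hypothetical rows shaped like the final output of the query above
    Result = #table(
        type table [query = text, bought = text, #"Matched Synonym" = nullable text, #"Attached Product" = nullable text],
        {
            {"buy icecream", "cake1", "icecream", "cake1"},
            {"sweets shop", "cake2", "sweets", "cake1"},
            {"icecream", "cake1", "icecream", "cake1"},
            {"shoes", "boots", null, null}
        }
    ),
    // Assumption: a sale counts for a synonym when the bought product is the attached product
    Hits = Table.SelectRows(Result, each [Matched Synonym] <> null and [bought] = [Attached Product]),
    // Number of such sales per synonym and product
    Sales = Table.Group(
        Hits,
        {"Matched Synonym", "Attached Product"},
        {{"Sales", each Table.RowCount(_), Int64.Type}}
    )
in
    Sales
```

With the sample rows, only the two "icecream" queries qualify, so the result is a single row crediting two sales to that synonym.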