
C# advanced search in list

I have an in-memory List of strings containing two items: 'product 1 max' and 'product 1 min'. How can I search it and get 'product 1 min' when the user input is 'product min'?

Note that the input is missing some words from the middle of the item.

var list = new List<string> {"product 1 max", "product 1 min" };
// the user enters 'product min' and expects 'product 1 min'

One way of doing it would be to split the input into words and match them against the words of each string in the list.

using System.Collections.Generic;
using System.Linq;

var list = new List<string> { "product 1 max", "product 1 min" };
var input = "product min";
List<string> inputParts = input.Split(' ').ToList();

// entries that contain every input word (case-sensitive, whole-word comparison)
List<string> results = list.Where(x => x.Split(' ').Intersect(inputParts).Count() == inputParts.Count).ToList();

// entries that contain at least one input word
List<string> partialMatches = list.Where(x => x.Split(' ').Intersect(inputParts).Count() > 0).ToList();
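The Intersect-based check above compares words case-sensitively, and because Intersect returns distinct elements, an input that repeats a word (such as 'min min') never reaches the expected count. A minimal sketch of one way around both issues, assuming case-insensitive matching is wanted, is to compare sets of words. The helper name ToWordSet is made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<string> { "product 1 max", "product 1 min" };
var input = "Product  MIN";

// Hypothetical helper: split on spaces, drop the empty entries produced by
// repeated spaces, and compare words without regard to case.
HashSet<string> ToWordSet(string text) =>
    new HashSet<string>(
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries),
        StringComparer.OrdinalIgnoreCase);

var inputWords = ToWordSet(input);

// Keep the entries whose word set contains every input word.
List<string> results = list
    .Where(entry => inputWords.IsSubsetOf(ToWordSet(entry)))
    .ToList();

Console.WriteLine(string.Join(", ", results));
```

Note that an empty input produces an empty set, which is a subset of every entry, so the whole list is returned in that case.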

See the documentation for the Enumerable.Intersect method for details.

If you split the input into words, you can filter the list to the entries that contain all the input words:

var inputWords = input.Split(' ');
var ans = list.Where(s => inputWords.All(s.Contains)).ToList();

NOTE: passing s.Contains as a method group is a shorter, more efficient (if more obscure) way of writing w => s.Contains(w).
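One thing to keep in mind is that string.Contains tests for substrings, not whole words, and is case-sensitive by default. The sketch below is an illustration rather than part of the original answer: it uses IndexOf with StringComparison.OrdinalIgnoreCase to ignore case, and adds a third, made-up list entry to show that 'min' also matches 'minimum'.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The third entry is invented here to demonstrate substring matching.
var products = new List<string> { "product 1 max", "product 1 min", "product 2 minimum" };
var query = "Product MIN";

var queryWords = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

// IndexOf with OrdinalIgnoreCase acts as a case-insensitive substring test.
var matches = products
    .Where(p => queryWords.All(w => p.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0))
    .ToList();

foreach (var match in matches)
{
    Console.WriteLine(match);
}
```

If only whole-word matches are acceptable, comparing split words (as in the Intersect approach) is the safer choice.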

One way to achieve that is to use the Damerau-Levenshtein algorithm. It calculates how many changes it would take for one string to become equal to another. It can be implemented by hand, but that is somewhat tricky, and there is already a library (SoftWx.Match) that encapsulates the logic for you.

SoftWx.Match has a static method called DamerauOSA(string value1, string value2) that returns a double between 0 and 1 indicating how similar the two strings are. Combined with LINQ, this makes it easy to filter the list:

List<string> products = new List<string>()
{
    "product 1 max",
    "product 1 min"
};
var stringToCompare = "product min";

products.ForEach(x => Console.WriteLine($"Item {x} against {stringToCompare} has {Similarity.DamerauOSA(x, stringToCompare)} points of similarity"));

// 0.80 is an arbitrary threshold for how similar the two strings must be
var filtered = products.Where(x => Similarity.DamerauOSA(x, stringToCompare) > 0.80).ToList();

Console.WriteLine("Filtered");
filtered.ForEach(x => Console.WriteLine(x));

Working example here
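For readers who want to see what counting the changes looks like without the library, here is a minimal sketch of the optimal string alignment (OSA) variant of Damerau-Levenshtein distance. The names OsaDistance and OsaSimilarity are hypothetical, and the normalization used (1 minus the distance divided by the longer length) is an assumption; SoftWx.Match may compute its score differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var products = new List<string> { "product 1 max", "product 1 min" };
var query = "product min";

// Number of insertions, deletions, substitutions, and adjacent
// transpositions needed to turn a into b (OSA variant).
int OsaDistance(string a, string b)
{
    var d = new int[a.Length + 1, b.Length + 1];
    for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
    for (int j = 0; j <= b.Length; j++) d[0, j] = j;

    for (int i = 1; i <= a.Length; i++)
    {
        for (int j = 1; j <= b.Length; j++)
        {
            int cost = a[i - 1] == b[j - 1] ? 0 : 1;
            d[i, j] = Math.Min(
                Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), // deletion, insertion
                d[i - 1, j - 1] + cost);                    // substitution or match

            // Adjacent transposition, e.g. "ab" -> "ba".
            if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
            {
                d[i, j] = Math.Min(d[i, j], d[i - 2, j - 2] + 1);
            }
        }
    }
    return d[a.Length, b.Length];
}

// Assumed normalization: 1.0 means identical, 0.0 means nothing in common.
double OsaSimilarity(string a, string b)
{
    int longest = Math.Max(a.Length, b.Length);
    return longest == 0 ? 1.0 : 1.0 - (double)OsaDistance(a, b) / longest;
}

// Same arbitrary 0.80 threshold as in the example above.
var close = products.Where(p => OsaSimilarity(p, query) > 0.80).ToList();

foreach (var item in close)
{
    Console.WriteLine($"{item}: {OsaSimilarity(item, query):F2}");
}
```

A distance-based score tolerates typos as well as missing words, at the cost of having to pick a threshold that suits the data.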
