
Sort a List by occurrence of a word with LINQ in C#

I have stored data in a list like this:

 List<SearchResult> list = new List<SearchResult>();
 SearchResult sr = new SearchResult();
 sr.Description = "sample description";
 list.Add(sr);

Suppose my data is stored in the Description field like this:

"JCB Excavator - ECU P/N: 728/35700"
"Geo Prism 1995 - ABS #16213899"
"Geo Prism 1995 - ABS #16213899"
"Geo Prism 1995 - ABS #16213899"
"Wie man BBA reman erreicht"
"this test JCB"
"Ersatz Airbags, Gurtstrammer und Auto Körper Teile"

Now I want to query the list with a search term such as geo jcb.

If you look, the word geo appears many times in the Description field. I want to sort my list so that the entries in which the search terms occur most often come first. Please help me do this. Thanks.

You can use string.Split and Enumerable.OrderByDescending with an anonymous type:

List<SearchResult> list = new List<SearchResult>() { 
    new SearchResult(){Description="JCB Excavator - ECU P/N: 728/35700"},
    new SearchResult(){Description="Geo Prism 1995 - ABS #16213899"},
    new SearchResult(){Description="Geo Prism 1995 - ABS #16213899"},
    new SearchResult(){Description="Geo Prism 1995 - ABS #16213899"},
    new SearchResult(){Description="Wie man BBA reman erreicht"},
    new SearchResult(){Description="this test JCB"},
    new SearchResult(){Description="Ersatz Airbags, Gurtstrammer und Auto Körper Teile"},
};

string[] searchTerms = new[]{"geo", "jcb"};
var results = 
    list.Select(sr => new { Searchresult = sr, Words = sr.Description.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries) })
        .OrderByDescending(x => x.Words.Count(w => searchTerms.Contains(w.ToLower())))
        .Select(x => x.Searchresult);
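
The query above can be wrapped into a small self-contained sketch. The question does not show the SearchResult class, so the shape below (a single Description property) is an assumption, and the names TermCountSort and SortByTermCount are made up for illustration. Instead of calling ToLower on every word, this version uses a case-insensitive HashSet, which is one possible alternative rather than the original answer's method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the class; the question only shows its Description property.
public class SearchResult
{
    public string Description { get; set; } = "";
}

// Hypothetical helper, not part of the original answer.
public static class TermCountSort
{
    // Orders results by how many space-separated words equal one of the search terms.
    public static List<SearchResult> SortByTermCount(
        IEnumerable<SearchResult> items, IEnumerable<string> searchTerms)
    {
        // Case-insensitive lookup, so "Geo", "GEO" and "geo" all count as the same term.
        var terms = new HashSet<string>(searchTerms, StringComparer.OrdinalIgnoreCase);

        return items
            .OrderByDescending(sr => sr.Description
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                .Count(word => terms.Contains(word)))
            .ToList();
    }
}
```

For the sample data and the terms geo and jcb, every matching description contains exactly one matching word. Because OrderByDescending is a stable sort, those entries keep their original relative order and the two German descriptions end up last.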

You could use a simple regular expression; just combine your search terms in the pattern with |:

var re = new Regex("geo|JCB",RegexOptions.IgnoreCase);

Then count the number of matches in your description:

Console.WriteLine(re.Matches(description).Count); // Outputs '5' in your example

You could order your list by this:

searchResults.OrderByDescending(r => re.Matches(r).Count);

Live example: http://rextester.com/MMAT58077
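
If the search terms come from user input, the pattern can be built at run time. The following is a minimal sketch under that assumption; RegexTermSort, BuildPattern and SortByMatchCount are hypothetical names, and the getText delegate stands in for something like r => r.Description. Note that a pattern such as geo|jcb matches substrings, so "geo" would also be counted inside a longer word like "geometry".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class RegexTermSort
{
    // Builds a case-insensitive alternation such as "geo|jcb" from the terms.
    // Regex.Escape keeps characters like '#' or '.' in a term from acting as regex syntax.
    public static Regex BuildPattern(IEnumerable<string> searchTerms)
    {
        var pattern = string.Join("|", searchTerms.Select(t => Regex.Escape(t)));
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }

    // Orders items by the number of regex matches found in the text of each item.
    public static List<T> SortByMatchCount<T>(
        IEnumerable<T> items, Func<T, string> getText, IEnumerable<string> searchTerms)
    {
        var re = BuildPattern(searchTerms);
        return items
            .OrderByDescending(item => re.Matches(getText(item)).Count)
            .ToList();
    }
}
```

A call could look like RegexTermSort.SortByMatchCount(list, r => r.Description, new[] { "geo", "jcb" }), assuming list holds objects with a Description property as in the question.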


Edit: According to your new question linked in the comments (and hopefully you'll update the details of this question and let the duplicate die), you wish to order the results so that the most common result shows up earlier in the list of results.

To do this, you could first calculate the relevant weighting of each search phrase, and use this to order the results.

Step 1: Calculate the weighting by counting how many descriptions in the entire data set contain each search word:

var wordsToFind = "Geo JCB".Split();
// find number of times each search phrase is found
var weights = wordsToFind.Select( w => new { 
         Word = w, 
         Weight = list.Where(x => x.Description.Contains(w)).Count() 
    } );

For the data currently in this question, this gives the result:

GEO: 3
JCB: 2

So you want all the GEO results first, followed by JCB. I guess a nice-to-have would be for the first result to be the one where GEO is mentioned most often.

Step 2: Use the weightings calculated in step 1 to order the results of a search.

var values = list.Select(x => new { 
      SearchResult = x, 
      Words = x.Description.Split(' ')
   })
   .Select(x => new { 
       SearchResult = x.SearchResult, 
       Weight = weights.Sum(w => x.Words.Contains(w.Word) ? w.Weight : 0)
   })
   .OrderByDescending(x => x.Weight)
   .Select(x => x.SearchResult);

Live example: http://rextester.com/SLH38676
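
The two steps can be combined into one sketch. The names WeightedTermSort, ComputeWeights and SortByWeight are hypothetical, and two details differ from the snippets above by choice: the weights are materialized once into a dictionary (the weights query above is lazy, so it would be re-evaluated for every item), and all comparisons are case-insensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class WeightedTermSort
{
    // Step 1: weight of a term = number of descriptions that contain it.
    public static Dictionary<string, int> ComputeWeights(
        IReadOnlyList<string> descriptions, IEnumerable<string> searchTerms)
    {
        return searchTerms
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToDictionary(
                term => term,
                term => descriptions.Count(
                    d => d.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0),
                StringComparer.OrdinalIgnoreCase);
    }

    // Step 2: score of an item = sum of the weights of the terms found among its words.
    public static List<T> SortByWeight<T>(
        IReadOnlyList<T> items, Func<T, string> getText, IEnumerable<string> searchTerms)
    {
        // Computed once, up front, rather than once per item.
        var weights = ComputeWeights(items.Select(getText).ToList(), searchTerms);

        return items
            .OrderByDescending(item =>
            {
                var words = new HashSet<string>(
                    getText(item).Split(' '), StringComparer.OrdinalIgnoreCase);
                return weights.Sum(w => words.Contains(w.Key) ? w.Value : 0);
            })
            .ToList();
    }
}
```

With the seven sample descriptions and the terms Geo and JCB, this yields the weights 3 and 2 shown above, so the three Geo entries sort first, followed by the two JCB entries, then the rest.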

List<SearchResult> list = new List<SearchResult>() 
{ 
   new SearchResult { Description = "JCB Excavator - ECU P/N: 728/35700" },
   new SearchResult { Description = "Geo Prism 1995 - ABS #16213899" },
   new SearchResult { Description = "Geo Prism 1995 - ABS #16213899" },
   new SearchResult { Description = "Geo Prism 1995 - ABS #16213899" },
   new SearchResult { Description = "Wie man BBA reman erreicht" },
   new SearchResult { Description = "this test JCB" },
   new SearchResult { Description = "Ersatz Airbags, Gurtstrammer und Auto Körper Teile" }            
};

var wordsToFind = "Geo JCB".Split();
var values = list.Select(x => new
                 {
                     SearchResult = x,
                     Count = x.Description.Split(' ').Count(c => wordsToFind.Contains(c))
                 })
                 .OrderByDescending(x => x.Count)
                 .Select(x => x.SearchResult);
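
// Variant: search terms match as substrings of words taken from two fields (Name and Name2).
// searchTerms is assumed to hold lower-case terms, e.g. new[] { "geo", "jcb" }.
var results = db.Blogs.AsEnumerable()
    .Select(sr => new
    {
        Searchresult = sr,
        // Split on runs of whitespace other than CR/LF (the original pattern was missing its closing bracket).
        Words = Regex.Split(sr.Name, @"[^\S\r\n]{1,}").Union(Regex.Split(sr.Name2, @"[^\S\r\n]{1,}"))
    })
    .OrderByDescending(x => x.Words.Count(w => searchTerms.Any(term => w.ToLower().Contains(term))))
    .Select(x => x.Searchresult);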
