Sort List by occurrence of a word by LINQ C#
I have stored my data in a list:
List<SearchResult> list = new List<SearchResult>();
SearchResult sr = new SearchResult();
sr.Description = "sample description";
list.Add(sr);
Suppose the Description field holds data like this:
"JCB Excavator - ECU P/N: 728/35700"
"Geo Prism 1995 - ABS #16213899"
"Geo Prism 1995 - ABS #16213899"
"Geo Prism 1995 - ABS #16213899"
"Wie man BBA reman erreicht"
"this test JCB"
"Ersatz Airbags, Gurtstrammer und Auto Körper Teile"
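Now I want to query the list with my search terms, for example geo jcb.
If you look at the data, the word geo occurs many times in the Description field. So I want to sort my list so that the entries containing the most words from my search terms come first. Please help me do this. Thanks.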
You can use string.Split and Enumerable.OrderByDescending together with an anonymous type:
List<SearchResult> list = new List<SearchResult>() {
new SearchResult(){Description="JCB Excavator - ECU P/N: 728/35700"},
new SearchResult(){Description="Geo Prism 1995 - ABS #16213899"},
new SearchResult(){Description="Geo Prism 1995 - ABS #16213899"},
new SearchResult(){Description="Geo Prism 1995 - ABS #16213899"},
new SearchResult(){Description="Wie man BBA reman erreicht"},
new SearchResult(){Description="this test JCB"},
new SearchResult(){Description="Ersatz Airbags, Gurtstrammer und Auto Körper Teile"},
};
string[] searchTerms = new[]{"geo", "jcb"};
var results =
list.Select(sr => new { Searchresult = sr, Words = sr.Description.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries) })
.OrderByDescending(x => x.Words.Count(w => searchTerms.Contains(w.ToLower())))
.Select(x => x.Searchresult);
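The query above lower-cases each word and compares it against lower-case search terms. A minimal sketch of the same idea, assuming a case-insensitive HashSet is acceptable, is shown below; the names TermCountSort, CountMatches, and SortByMatches are hypothetical helpers introduced only for illustration, and SearchResult is reduced to the one property the question shows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's class; only Description is assumed here.
public class SearchResult
{
    public string Description { get; set; }
}

public static class TermCountSort
{
    // Counts how many space-separated words of the description are search terms.
    // Case handling is delegated to the comparer of the supplied set.
    public static int CountMatches(string description, ISet<string> terms)
    {
        return description
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Count(w => terms.Contains(w));
    }

    // Orders results so that descriptions with the most matching words come first.
    public static List<SearchResult> SortByMatches(
        IEnumerable<SearchResult> items, IEnumerable<string> searchTerms)
    {
        var terms = new HashSet<string>(searchTerms, StringComparer.OrdinalIgnoreCase);
        return items
            .OrderByDescending(sr => CountMatches(sr.Description ?? "", terms))
            .ToList();
    }
}
```

Because Enumerable.OrderByDescending is a stable sort, entries with the same match count keep their original relative order. Note that splitting on spaces only finds whole words, so a term attached to punctuation (such as "JCB,") would not be counted.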
You can use a simple regular expression: just join the search terms in the pattern with |:
var re = new Regex("geo|JCB",RegexOptions.IgnoreCase);
Then count the number of matches in the description:
Console.WriteLine(re.Matches(description).Count); // Outputs '5' in your example
You can then order the list like this:
searchResults.OrderByDescending(r => re.Matches(r).Count);
Live example: http://rextester.com/MMAT58077
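The pattern above is hard-coded. If the search terms come from user input, one way to build it is sketched below, assuming the terms arrive as a sequence of strings; RegexTermSort, BuildPattern, and SortByMatchCount are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexTermSort
{
    // Joins the terms into an alternation such as "geo|jcb".
    // Regex.Escape keeps characters like '+' or '.' literal.
    public static Regex BuildPattern(IEnumerable<string> searchTerms)
    {
        var pattern = string.Join("|", searchTerms.Select(t => Regex.Escape(t)));
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }

    // Orders descriptions by how many times any search term matches, most first.
    public static List<string> SortByMatchCount(IEnumerable<string> descriptions, Regex re)
    {
        return descriptions
            .OrderByDescending(d => re.Matches(d).Count)
            .ToList();
    }
}
```

A pattern built this way matches substrings as well, so geo would also match inside a longer word; wrapping each escaped term in \b anchors is one option if only whole words should count.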
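Edit: Based on the new question you linked in the comments (and in the hope that you will update this question with the details and let the duplicate die), you want to order the results so that the most common result appears at the front of the result list.
To do this, you can first calculate a relative weight for each search phrase and then use it to sort the results.
Step 1: Calculate the weights by counting how many times each search term is found across the whole data set: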
var wordsToFind = "Geo JCB".Split();
// find number of times each search phrase is found
var weights = wordsToFind.Select( w => new {
Word = w,
Weight = list.Where(x => x.Description.Contains(w)).Count()
} );
For the data in this question, this gives the following result:
GEO: 3
JCB: 2
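So you want all the GEO results first, followed by the JCB results. I suppose a nice refinement would be to make the first result the one in which GEO is mentioned most often.
Step 2: Use the weights calculated in step 1 to order the search results.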
var values = list.Select(x => new {
SearchResult = x,
Words = x.Description.Split(' ')
})
.Select(x => new {
SearchResult = x.SearchResult,
Weight = weights.Sum(w => x.Words.Contains(w.Word) ? w.Weight : 0)
})
.OrderByDescending(x => x.Weight)
.Select(x => x.SearchResult);
Live example: http://rextester.com/SLH38676
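Both steps above compare strings case-sensitively, so they work for "Geo JCB" but would find nothing for the lower-case query geo jcb from the question. A minimal sketch of a case-insensitive variant follows, assuming the descriptions are plain strings and the search terms are distinct; WeightedSort, ComputeWeights, and SortByWeight are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeightedSort
{
    // Step 1: weight of a term = number of descriptions containing it,
    // ignoring case. Assumes the terms are distinct (ToDictionary throws otherwise).
    public static Dictionary<string, int> ComputeWeights(
        IEnumerable<string> descriptions, IEnumerable<string> terms)
    {
        var all = descriptions.ToList();
        return terms.ToDictionary(
            t => t,
            t => all.Count(d => d.IndexOf(t, StringComparison.OrdinalIgnoreCase) >= 0),
            StringComparer.OrdinalIgnoreCase);
    }

    // Step 2: score = sum of the weights of the terms that occur as a whole word,
    // then order by that score, highest first.
    public static List<string> SortByWeight(
        IEnumerable<string> descriptions, IEnumerable<string> terms)
    {
        var all = descriptions.ToList();
        var weights = ComputeWeights(all, terms);
        return all
            .OrderByDescending(d =>
            {
                var words = d.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                return weights.Sum(w =>
                    words.Contains(w.Key, StringComparer.OrdinalIgnoreCase) ? w.Value : 0);
            })
            .ToList();
    }
}
```

With the sample data from the question and the terms geo and jcb, the three Geo Prism entries score 3 and the two JCB entries score 2, so the Geo entries are listed first.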
List<SearchResult> list = new List<SearchResult>()
{
new SearchResult { Description = "JCB Excavator - ECU P/N: 728/35700" },
new SearchResult { Description = "Geo Prism 1995 - ABS #16213899" },
new SearchResult { Description = "Geo Prism 1995 - ABS #16213899" },
new SearchResult { Description = "Geo Prism 1995 - ABS #16213899" },
new SearchResult { Description = "Wie man BBA reman erreicht" },
new SearchResult { Description = "this test JCB" },
new SearchResult { Description = "Ersatz Airbags, Gurtstrammer und Auto Körper Teile" }
};
var wordsToFind = "Geo JCB".Split();
var values = list.Select(x => new { SearchResult = x, Count = x.Description.Split(' ')
.Where(c => wordsToFind.Contains(c)).Count() })
.OrderByDescending(x => x.Count)
.Select(x => x.SearchResult);
var results = db.Blogs.AsEnumerable()
.Select(sr => new
{
Searchresult = sr,
Words = Regex.Split(sr.Name, @"[^\S\r\n]{1,}").Union(Regex.Split(sr.Name2, @"[^\S\r\n]{1,}"))
})
.OrderByDescending(x => x.Words.Count(w => {
foreach (var item in searchTerms)
{
if(w.ToLower().Contains(item))
{
return true;
}
}
return false;
}))
.Select(x => x.Searchresult);