
Efficient way to search for phrases that contain specific words

I need the most efficient way to do this in C#.

Suppose:

  1. Collection1: {"I am good", "He is best", "They are poor", "Mostly they are average", "All are very nice"}
  2. Collection2: {"good", "best", "nice"}

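I want to search for every item of Collection2 within the items of Collection1 and store the matching phrases in Collection3, so Collection3 would look like this:

Collection3: {"I am good", "He is best", "All are very nice"}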

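// Requires: using System; using System.Collections.Generic;
IList<String> Collection3 = new List<String>();

// For each search word, scan every phrase and keep the ones that contain it.
for (int i = 0; i < Collection2.Count; i++)
{
   foreach (String str in Collection1)
   {
      if (str.Contains(Collection2[i]))
      {
         Collection3.Add(str);
      }
   }
}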

What is the best way to do this?

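// Requires: using System; using System.Linq;
string[] Collection1 = {"I am good", "He is best", "They are poor", "Mostly they are average", "All are very nice"};
string[] Collection2 = { "good", "best", "nice" };

// Keep each phrase that contains any search word, ignoring case,
// and return the phrases in their original casing.
var Collection3 = Collection1
    .Where(x => Collection2.Any(y => x.IndexOf(y, StringComparison.OrdinalIgnoreCase) >= 0))
    .ToArray();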
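
One caveat with a substring-based filter like the one above is that it matches anywhere inside a phrase, so "good" is also found inside "goodness". The following is a minimal sketch of that effect. The `SubstringDemo` class, the `FindBySubstring` helper, and the phrase "Goodness me" are hypothetical and appear here only for illustration.

```csharp
using System;
using System.Linq;

static class SubstringDemo
{
    // Returns the phrases that contain any of the words as a substring, ignoring case.
    public static string[] FindBySubstring(string[] phrases, string[] words)
    {
        return phrases
            .Where(p => words.Any(w => p.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToArray();
    }

    static void Main()
    {
        // "Goodness me" is a made-up phrase used to show the substring effect.
        string[] phrases = { "I am good", "Goodness me", "They are poor" };
        string[] words = { "good" };

        foreach (var phrase in FindBySubstring(phrases, words))
        {
            Console.WriteLine(phrase);
        }
    }
}
```

Both "I am good" and "Goodness me" are printed. If only whole words should count, the phrases need to be split into words first.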

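Assuming the items in your Collection2 are words in the ordinary sense of the word [no pun intended], you can use LINQ's ToLookup, which gives you a reasonable analogue of a MultiValueDictionary, and you could try something like this: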

var phrases = new[] { "I am good", "He is best", "They are poor", "Mostly they are average", "All are very nice", "Not so\tgood \t", };

var lookup = phrases
    .Select((phrase, index) =>
        new
        {
            phrase,
            index,
            words = phrase.Split((Char[])null, StringSplitOptions.RemoveEmptyEntries)
        })
    .SelectMany(item =>
        item
            .words
            .Select(word =>
                new
                {
                    word,
                    item.index,
                    item.phrase
                }))
    .ToLookup(
        keySelector: item => item.word,
        elementSelector: item => new { item.phrase, item.index });

var wordsToSearch = new[] { "good", "best", "nice" };

var searchResults = wordsToSearch
    .Select(word =>
        new
        {
            word,
            phrases = lookup[word].ToArray()
        });

foreach (var result in searchResults)
{
    Console.WriteLine(
        "Word '{0}' can be found in phrases : {1}",
        result.word,
        String.Join(
            ", ",
            result
                .phrases
                .Select(phrase => 
                    String.Format("{0}='{1}'", phrase.index, phrase.phrase))));
}      

This gives you both the index and the phrase, so you can adapt it as needed.
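
If only the matching phrases are needed (the Collection3 from the question) and not a per-word index, a lighter variant may be enough. This is a sketch and not part of the original answer: it puts the search words in a `HashSet<string>` and tests each whitespace-separated token of a phrase against it. The `WholeWordSearch` class and the `FindPhrases` method are hypothetical names, and the case-insensitive comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WholeWordSearch
{
    // Returns the phrases in which at least one whitespace-separated token
    // equals one of the search words (case-insensitive). Each phrase is
    // returned at most once, in its original order.
    public static List<string> FindPhrases(IEnumerable<string> phrases, IEnumerable<string> words)
    {
        // Assumption: an ordinal, case-insensitive comparison is acceptable.
        var wordSet = new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);

        return phrases
            .Where(phrase => phrase
                // Passing null as the separator array splits on whitespace.
                .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
                .Any(token => wordSet.Contains(token)))
            .ToList();
    }

    static void Main()
    {
        var phrases = new[] { "I am good", "He is best", "They are poor", "Mostly they are average", "All are very nice" };
        var words = new[] { "good", "best", "nice" };

        foreach (var phrase in FindPhrases(phrases, words))
        {
            Console.WriteLine(phrase);
        }
    }
}
```

With the sample data from the question this prints "I am good", "He is best" and "All are very nice". Each phrase is split once and each token costs one hash lookup, so the work grows with the total number of words in the phrases and does not depend on the number of search words. Punctuation attached to a word (for example "good,") is not stripped in this sketch.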

But if your Collection2 consists of phrases and not single words, you will need something more powerful, such as lucene.net, which handles full-text search properly.

