
Data structure for storing keywords and synonyms in C#?

I'm working on a C# project in which I need to store between 10 and 15 keywords along with their synonyms.

The first way I thought of to store these was a 2D list, something like List<List<string>>, so that it would look like:

keyword1 synonym1 synonym2

keyword2 synonym1

keyword3 synonym1 synonym2 etc.

What I started to wonder is this: if I take an input string and split it so I can check whether each word is a keyword or a synonym of a keyword in the list, will a 2D list be fine, or will searching it be too slow?

Hopefully my question makes sense. If anything is unclear, just ask and I can clarify. Thanks!

will searching [the list] be too slow?

When you are talking about 10..15 keywords, it is hard to come up with an algorithm inefficient enough to make end-users notice the slowness. There's simply not enough data to slow down a modern CPU.
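
To put that in perspective, here is a minimal sketch of the nested-list search the question describes. The class name NestedListLookup, the FindKeyword helper, and the placeholder words are assumptions made for illustration; they do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NestedListLookup
{
    // Each inner list holds a keyword first, followed by its synonyms.
    // The words are placeholders, not real data.
    public static readonly List<List<string>> Groups = new List<List<string>>
    {
        new List<string> { "keyword1", "synonym1", "synonym2" },
        new List<string> { "keyword2", "synonym3" },
        new List<string> { "keyword3" },
    };

    // Returns the keyword whose group contains the word, or null if no group does.
    public static string FindKeyword(string word)
    {
        // Linear scan: look at each group in turn, and at each entry inside it.
        var group = Groups.FirstOrDefault(g => g.Contains(word));
        return group?[0];
    }
}
```

With 10 to 15 groups, even this worst case only compares the input word against a few dozen strings, so the scan should not be a practical bottleneck.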

One approach would be to build a Dictionary<string,string> that maps every synonym to its "canonical" keyword. This would include the canonical version itself:

var keywords = new Dictionary<string, string> {
    ["keyword1"] = "keyword1",
    ["synonym1"] = "keyword1",
    ["synonym2"] = "keyword1",
    ["keyword2"] = "keyword2",
    ["synonym3"] = "keyword2",
    ["keyword3"] = "keyword3",
};

Note how both keywords and synonyms appear as keys, while only keywords appear as values. This lets you look up a keyword or synonym, and get back a guaranteed keyword.
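
As a rough sketch of how such a dictionary might be queried for a split input string: the class name KeywordLookup and the FindKeywords helper below are hypothetical, and splitting on single spaces is an assumption about the input format. TryGetValue is used so that words that are neither keywords nor synonyms are simply skipped.

```csharp
using System;
using System.Collections.Generic;

public static class KeywordLookup
{
    // Every synonym and every keyword maps to its canonical keyword.
    private static readonly Dictionary<string, string> Keywords =
        new Dictionary<string, string>
        {
            ["keyword1"] = "keyword1",
            ["synonym1"] = "keyword1",
            ["synonym2"] = "keyword1",
            ["keyword2"] = "keyword2",
            ["synonym3"] = "keyword2",
            ["keyword3"] = "keyword3",
        };

    // Returns the distinct canonical keywords found in the input,
    // in order of first appearance.
    public static List<string> FindKeywords(string input)
    {
        var found = new List<string>();
        // Assumption: words are separated by spaces.
        var words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            // TryGetValue returns false for unknown words instead of throwing.
            if (Keywords.TryGetValue(word, out var keyword) && !found.Contains(keyword))
            {
                found.Add(keyword);
            }
        }
        return found;
    }
}
```

For example, FindKeywords("synonym2 hello synonym3 keyword1") would return a list containing keyword1 and keyword2, because "hello" is not in the dictionary and the second hit on keyword1 is a duplicate.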

I would probably use a Dictionary in which the key is the synonym and the value is the keyword. You can then look up any word in the Dictionary and get back the actual keyword you want. For example:

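// Needs: using System.Collections.Generic; using System.Linq;
private Dictionary<string, string> synonymKeywordDict = new Dictionary<string, string>();

public SearchResult Search(IEnumerable<string> searchTerms)
{
  // Skip words that are neither keywords nor synonyms; the indexer would throw KeyNotFoundException for them.
  var keywords = searchTerms
    .Where(x => synonymKeywordDict.ContainsKey(x))
    .Select(x => synonymKeywordDict[x])
    .Distinct()
    .ToList();
  // keywords now contains your keywords after being translated from any synonyms
  // ...build and return the SearchResult from keywords
}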

In case that isn't clear enough, the Dictionary would be loaded like so:

private void LoadDictionary()
{
  //So our lookup doesn't fail on the key word itself.
  synonymKeywordDict.Add("computer", "computer");
  //Then all our synonyms
  synonymKeywordDict.Add("desktop", "computer");
  synonymKeywordDict.Add("PC", "computer");
}
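
One thing to keep in mind is that a Dictionary with string keys is case-sensitive by default, so "PC" above would not match an input of "pc". The sketch below is one possible way to handle that, assuming case-insensitive matching is wanted; the SynonymTable class and its Build method are hypothetical names. It builds the synonym-to-keyword map from a keyword-to-synonyms grouping.

```csharp
using System;
using System.Collections.Generic;

public static class SynonymTable
{
    // Builds a synonym -> keyword map from a keyword -> synonyms grouping.
    public static Dictionary<string, string> Build(
        IDictionary<string, string[]> groups)
    {
        // OrdinalIgnoreCase lets "PC", "pc" and "Pc" all find the same entry.
        var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var pair in groups)
        {
            // The keyword maps to itself so a lookup on it does not fail.
            map[pair.Key] = pair.Key;
            foreach (var synonym in pair.Value)
            {
                map[synonym] = pair.Key;
            }
        }
        return map;
    }
}
```

Calling Build with a grouping such as { ["computer"] = new[] { "desktop", "PC" } } gives a map in which "pc", "Desktop" and "COMPUTER" all resolve to "computer".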


 