How to check if 2 words have the same base or stem?

I'm trying to merge words that have the same base. Example:

  • accident
  • accidental
  • accidentally
  • accidents

or

  • abandon
  • abandoned
  • abandoning

At first I used

 new Word.Application().SynonymInfo[myWord, Word.WdLanguageID.wdEnglishUS];

to get the synonyms of a word from word.dll. But I realized that I don't want to merge only synonyms; I want to merge words that share the same base.

Is there a function in word.dll, or in any other DLL, that tells me whether two words have the same base?

You are probably looking for Inflector, which is an open-source library.

It has been made .NET 3.5 compatible.

Here is some sample code for it.

The English language has a lot of exceptions, but handling a few of the most common scenarios with your own small function will probably cover about 90% of cases.

There seem to be a few common scenarios:

a) Past tense: formed by adding the suffix "ed"

b) Plurals: formed by adding "s" or "es"

c) Common suffixes for forming adjectives

d) Common suffixes for forming adverbs

e) Common suffixes for converting verbs to nouns

So, by removing common suffixes from words, we can merge the words that reduce to the same base.
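The suffix-removal idea can be sketched roughly as follows. This is only a minimal illustration, not a real stemmer: the class name `SuffixStemmer`, the suffix list, and the minimum base length of 3 are all assumptions made for this example, chosen so that the words from the question group together.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SuffixStemmer
{
    // Illustrative suffix list (an assumption, far from exhaustive).
    // Longer suffixes come first so "ally" is tried before "ly" or "al".
    private static readonly string[] Suffixes =
        { "ally", "ing", "al", "ly", "ed", "es", "s" };

    // Do not strip a suffix if fewer than this many characters would remain.
    private const int MinBaseLength = 3;

    // Returns a crude base form by removing the first matching suffix.
    public static string CrudeBase(string word)
    {
        string w = word.Trim().ToLowerInvariant();
        foreach (string suffix in Suffixes)
        {
            if (w.EndsWith(suffix, StringComparison.Ordinal)
                && w.Length - suffix.Length >= MinBaseLength)
            {
                return w.Substring(0, w.Length - suffix.Length);
            }
        }
        return w;
    }

    // Two words are treated as related when their crude bases are equal.
    public static bool HaveSameBase(string a, string b)
    {
        return CrudeBase(a) == CrudeBase(b);
    }

    // Merges a list of words into groups keyed by crude base.
    public static Dictionary<string, List<string>> GroupByBase(IEnumerable<string> words)
    {
        return words
            .GroupBy(w => CrudeBase(w))
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

With this list, "accident", "accidental", "accidentally" and "accidents" all reduce to "accident", and "abandoned" and "abandoning" reduce to "abandon". It will still get many words wrong (for example "making" becomes "mak", which does not match "make"), so an established stemming algorithm such as Porter's is likely a better choice once the word list grows.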

For less common scenarios, we can perhaps use a string similarity algorithm to decide whether two strings are similar, for example a Levenshtein distance implementation:

using LINQ
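For reference, a minimal sketch of the Levenshtein distance, written with plain loops rather than LINQ. It is not the linked implementation; the class name `StringSimilarity`, the helper `AreSimilar`, and its `maxDistance` threshold are assumptions made for illustration.

```csharp
using System;

public static class StringSimilarity
{
    // Classic dynamic-programming Levenshtein distance: the minimum number
    // of single-character insertions, deletions, and substitutions needed
    // to turn s into t. Only two rows of the table are kept in memory.
    public static int LevenshteinDistance(string s, string t)
    {
        if (s == null) throw new ArgumentNullException("s");
        if (t == null) throw new ArgumentNullException("t");

        int[] previous = new int[t.Length + 1];
        int[] current = new int[t.Length + 1];

        // Distance from the empty prefix of s to each prefix of t.
        for (int j = 0; j <= t.Length; j++)
        {
            previous[j] = j;
        }

        for (int i = 1; i <= s.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }

            // The row just computed becomes the previous row.
            int[] swap = previous;
            previous = current;
            current = swap;
        }

        return previous[t.Length];
    }

    // Hypothetical helper: treats two words as similar when their
    // case-insensitive distance does not exceed maxDistance.
    public static bool AreSimilar(string a, string b, int maxDistance)
    {
        return LevenshteinDistance(a.ToLowerInvariant(), b.ToLowerInvariant()) <= maxDistance;
    }
}
```

A small threshold such as 2 would match "abandon" with "abandoned", but edit distance alone also matches unrelated words such as "cat" and "car", so it is probably best used as a fallback after suffix removal rather than on its own.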

Please also see the following Stack Overflow question:

Are there any Fuzzy Search or String Similarity Functions libraries written for C#?
