How to determine which string in an array is most similar to a given string?

Given a string,

string name = "Michael";

I want to be able to evaluate which string in an array is most similar to it:

string[] names = new[] { "John", "Adam", "Paul", "Mike", "John-Michael" };

I want to create a message for the user: "We couldn't find 'Michael', but 'John-Michael' is close. Is that what you meant?" How would I make this determination?

This is usually done with the edit distance (Levenshtein distance): the minimum number of single-character insertions, deletions, or substitutions required to transform one word into the other. The candidate with the smallest distance to the input is the closest match.

There's an article with a generic C# implementation here.
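In case the linked article is unavailable, here is a minimal sketch of the same idea in C#. The class and method names (`NameMatcher`, `Levenshtein`, `Closest`) are made up for illustration and are not taken from the article; the comparison is case-sensitive and ties go to the earlier array element.

```csharp
using System;
using System.Linq;

public static class NameMatcher
{
    // Levenshtein distance via dynamic programming, keeping only two rows.
    // Hypothetical helper for illustration; case-sensitive.
    public static int Levenshtein(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Turning the empty string into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.Length; j++)
            previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            // Turning the first i chars of a into the empty string takes i deletions.
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.Min(substitution, Math.Min(deletion, insertion));
            }
            // Swap rows so "previous" holds the row just computed.
            (previous, current) = (current, previous);
        }
        return previous[b.Length];
    }

    // Returns the candidate with the smallest distance to the target.
    // OrderBy is a stable sort, so the first of several equally close candidates wins.
    public static string Closest(string target, string[] candidates) =>
        candidates.OrderBy(c => Levenshtein(target, c)).First();
}
```

Calling `NameMatcher.Closest("Michael", names)` with the array from the question returns "Mike" (distance 4), not "John-Michael" (distance 5), so plain edit distance may not give the suggestion you have in mind when the input is a substring of a longer name.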

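Here are the results for your example using the Levenshtein distance, computed in Mathematica (lower means closer). Note that "Mike" (4) comes out ahead of "John-Michael" (5):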

EditDistance["Michael",#]&/@{"John","Adam","Paul","Mike","John-Michael"}
{6,6,5,4,5}  

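Here are the results using the Smith-Waterman similarity, which scores the best-matching local region of the two strings (higher means more similar). It ranks "John-Michael" first: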

SmithWatermanSimilarity["Michael",#]&/@{"John","Adam","Paul","Mike","John-Michael"}
{0.,0.,0.,2.,7.} 
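For C#, a rough sketch of a Smith-Waterman local alignment score could look like the following. The scoring values (match +1, mismatch -1, gap -1) and the names `LocalAlignment` and `Score` are assumptions chosen for illustration; they are not Mathematica's defaults, so scores for weak matches may differ from the numbers listed above.

```csharp
using System;

public static class LocalAlignment
{
    // Smith-Waterman local alignment score with a linear gap penalty.
    // The default scoring values are assumptions; tune them for your data.
    public static int Score(string a, string b, int match = 1, int mismatch = -1, int gap = -1)
    {
        // h[i, j] is the best score of any local alignment ending at a[i-1] and b[j-1].
        // C# zero-initialises the array, which gives the required boundary row and column.
        var h = new int[a.Length + 1, b.Length + 1];
        int best = 0;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int diagonal = h[i - 1, j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
                int up = h[i - 1, j] + gap;   // gap in b
                int left = h[i, j - 1] + gap; // gap in a

                // Clamping at zero lets an alignment restart anywhere,
                // which is what makes the alignment local rather than global.
                h[i, j] = Math.Max(0, Math.Max(diagonal, Math.Max(up, left)));
                best = Math.Max(best, h[i, j]);
            }
        }
        return best;
    }
}
```

With these assumed scores, `LocalAlignment.Score("Michael", "John-Michael")` is 7 and `LocalAlignment.Score("Michael", "Mike")` is 2, so picking the candidate with the highest score suggests "John-Michael".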

HTH!
