String matching algorithm

Say I have 3 strings. And then 1 more string.
Is there an algorithm that would allow me to find which one of the first 3 strings matches the 4th string the most?
None of the strings are going to be exact matches, I'm just trying to find the closest match.
And if the algorithm already exists in STL, that would be nice.

Thanks in advance.

You don't specify what exactly you mean by "matches the most", so I assume you don't have precise requirements. In that case, Levenshtein distance is a reasonable metric. Simply compute the Levenshtein distance between each of the three strings and the fourth, and pick the one that gives the lowest distance.

You can implement the Levenshtein distance algorithm; it provides a very nice measure of how close a match between two strings you have. It counts the minimum number of single-character edits (insertions, deletions, or substitutions) needed to turn one string into the other. You can find a C++ implementation here.

Compute the Levenshtein distance between string #4 and each of the three strings that you have. Pick the string with the smallest distance.
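As a rough illustration of that approach, here is a minimal C++ sketch. The names `levenshtein` and `closest_match` are made up for this example (nothing like them ships with the standard library), and the code compares raw bytes, so it assumes plain ASCII or otherwise single-byte text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Levenshtein distance: the minimum number of single-character insertions,
// deletions, and substitutions needed to turn `a` into `b`.
// Hypothetical helper, not part of the standard library.
std::size_t levenshtein(const std::string& a, const std::string& b) {
    // Only two rows of the dynamic-programming table are kept at a time.
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    // Distance from the empty prefix of `a` to the first j chars of `b` is j.
    std::iota(prev.begin(), prev.end(), std::size_t{0});

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // deleting the first i chars of `a`
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution or match
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Returns the candidate with the smallest distance to `target`.
// On a tie, the earliest candidate wins. Distances are recomputed on each
// comparison, which is fine for a handful of strings.
std::string closest_match(const std::vector<std::string>& candidates,
                          const std::string& target) {
    assert(!candidates.empty());
    return *std::min_element(
        candidates.begin(), candidates.end(),
        [&target](const std::string& x, const std::string& y) {
            return levenshtein(x, target) < levenshtein(y, target);
        });
}
```

With the three strings in a `std::vector<std::string>` and the fourth as `target`, `closest_match(candidates, target)` returns the nearest one. For long strings or many candidates you would probably want to compute each distance once and cache it.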

There's nothing ready in the STL, but what you need is some kind of string metric.

You have an approximate string matching problem. Depending on what kind of matching you want to perform, you will use a different algorithm. There are many: SOUNDEX, Jaro-Winkler, Levenshtein distance, Metaphone, etc. Regarding the STL, I don't know of any functions that implement those algorithms, but you can take a look here for some source code in C++. Also, note that if you are getting your strings from a database, it is very likely that your database engine implements some of those algorithms (most likely SOUNDEX).
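To show how a phonetic algorithm differs from an edit-distance one, here is a small sketch of SOUNDEX in C++. It follows the common "American Soundex" rules as I understand them (a letter plus three digits, with H and W not breaking a run of equal codes). The function names are invented for this example, it only handles ASCII letters, and database engines may implement slightly different variants.

```cpp
#include <cctype>
#include <string>

// Soundex digit for a letter; '0' marks letters that are not coded
// (vowels, Y, H, W, and anything that is not a letter).
char soundex_digit(char c) {
    switch (std::toupper(static_cast<unsigned char>(c))) {
        case 'B': case 'F': case 'P': case 'V':
            return '1';
        case 'C': case 'G': case 'J': case 'K':
        case 'Q': case 'S': case 'X': case 'Z':
            return '2';
        case 'D': case 'T':
            return '3';
        case 'L':
            return '4';
        case 'M': case 'N':
            return '5';
        case 'R':
            return '6';
        default:
            return '0';
    }
}

// Four-character Soundex code, e.g. a letter followed by three digits.
// Returns an empty string if `word` contains no letters.
std::string soundex(const std::string& word) {
    std::string code;
    char last = '0';  // digit of the previous letter that counts for run detection
    for (char raw : word) {
        const unsigned char uc = static_cast<unsigned char>(raw);
        if (!std::isalpha(uc)) continue;  // skip spaces, hyphens, digits
        const char c = static_cast<char>(std::toupper(uc));
        const char d = soundex_digit(c);
        if (code.empty()) {
            code.push_back(c);  // the first letter is kept as-is
            last = d;
        } else {
            if (d != '0' && d != last) code.push_back(d);
            // H and W do not separate equal codes; vowels do.
            if (c != 'H' && c != 'W') last = d;
        }
        if (code.size() == 4) break;
    }
    if (!code.empty()) code.append(4 - code.size(), '0');  // pad with zeros
    return code;
}
```

Two strings "match" when their codes are equal, so `soundex("Robert")` and `soundex("Rupert")` produce the same code. That is a yes/no answer rather than a ranking, which is why an edit distance is usually the better fit for picking the closest of several candidates.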


 