With Levenshtein distance, the usual question is: given two strings, what is their Levenshtein distance? How would you go about the reverse, taking a string and a Levenshtein distance and generating all the strings within that distance? (It would also take in a character set.) So if I pass in a string x and a distance d, it should give me all the strings within that edit distance of x, including those at distance d-1, d-2, and so on (every distance n < d).
Expected functionality:
>>> getWithinDistance('apple',2,{'a','b',' '})
['applea','appleb','appel','app le'...]
Please note that the program is able to produce 'app le', as the space character is included in the character set.
There's a data structure that does this, called the Levenshtein automaton. You construct it from a set of strings (which may have only one member) and a fixed distance k, and then you can query it for all strings within distance k of any of the strings it stores. A Python implementation is discussed here.
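To give a feel for the idea, here is a minimal sketch that simulates such an automaton lazily for a single word rather than compiling it up front. The state is one row of the usual edit-distance table, and a branch is abandoned as soon as every entry in the row exceeds k. The name `strings_within` and its parameters are made up for this illustration, and this is not the implementation the linked discussion describes.

```python
def strings_within(word, k, alphabet):
    """Return every string over `alphabet` within Levenshtein distance k of `word`.

    Hypothetical helper: the DP row acts as the automaton state.
    """
    results = []

    def step(row, c):
        # row[j] = distance(prefix, word[:j]); compute the row for prefix + c.
        new = [row[0] + 1]
        for j in range(1, len(word) + 1):
            cost = 0 if word[j - 1] == c else 1
            new.append(min(new[j - 1] + 1,     # word[j-1] has no partner in the candidate
                           row[j] + 1,         # c has no partner in word
                           row[j - 1] + cost)) # c paired with word[j-1]
        return new

    def walk(prefix, row):
        if row[-1] <= k:              # accepting state: whole word matched within k
            results.append(prefix)
        for c in sorted(alphabet):
            nxt = step(row, c)
            if min(nxt) <= k:         # otherwise no extension can ever be accepted
                walk(prefix + c, nxt)

    walk("", list(range(len(word) + 1)))
    return results
```

Note that this only produces strings made of characters in `alphabet`, so to get results like 'applea' or 'appel' for 'apple' you would pass the letters of the word together with the extra characters, for example `strings_within('apple', 2, set('apple') | {'a', 'b', ' '})`.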
Alternatively, you can do a depth-limited search with backtracking for such strings.
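A depth-limited search of this kind could look roughly like the sketch below: apply every single edit (insertion, deletion, substitution) to the current string, recurse, and stop once d edits have been used. The names `single_edits` and `get_within_distance` are placeholders chosen for this example, and the sketch assumes that only characters from the given set may be inserted or substituted.

```python
def single_edits(s, charset):
    """Yield every string exactly one edit operation away from s."""
    for i in range(len(s) + 1):
        for c in charset:
            yield s[:i] + c + s[i:]              # insert c before position i
    for i in range(len(s)):
        yield s[:i] + s[i + 1:]                  # delete s[i]
        for c in charset:
            if c != s[i]:
                yield s[:i] + c + s[i + 1:]      # substitute s[i] with c


def get_within_distance(word, max_dist, charset):
    """Return the set of strings reachable from word in at most max_dist edits."""
    seen = {word: 0}  # string -> fewest edits used to reach it so far

    def search(s, depth):
        if depth == max_dist:                    # depth limit reached: backtrack
            return
        for t in single_edits(s, charset):
            # Revisit t only if we got here with fewer edits than before.
            if t not in seen or seen[t] > depth + 1:
                seen[t] = depth + 1
                search(t, depth + 1)

    search(word, 0)
    return set(seen)
```

For example, `get_within_distance('apple', 2, {'a', 'b', ' '})` contains 'applea', 'appleb' and 'app le', as well as 'apple' itself at distance 0. Under the assumption above, a result such as 'appel' only shows up if 'e' or 'l' is also in the character set. The number of results grows quickly with the distance and the size of the character set.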