
Levenshtein-distance algorithm

def worddistance(source, target):
    """Return the Levenshtein distance between two strings."""

    if len(source) > len(target):
        source, target = target, source
    # Now target is the longer string; if its length is 0, surely len(source) is 0 too?
    if len(target) == 0:
        return len(source)

    # ... continue on to calculate the distance ...

Isn't that the same as saying: if both parameters are the same, return 0?

I am not exactly sure what this part of the function is trying to achieve.

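Yes, the code returns 0 only when both strings have length 0: after the swap, target is the longer string, so an empty target implies an empty source. You can see almost the same style in the Wikibooks implementation, but the coder here simply hasn't thought the code through.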

You can simply change that second test to:

if not target:
    return 0

and not change the meaning.

The Wikibooks implementation tests source however:

if not source:
    return len(target)

which makes much more sense.
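To make the difference concrete, here is a small sketch comparing the three checks in isolation. The helper names (`original_check`, `simplified_check`, `source_check`) are made up for this illustration, and returning `None` is just a stand-in for "fall through to the rest of the algorithm".

```python
from itertools import product


def original_check(source, target):
    """The check from the question: tests the longer string."""
    if len(source) > len(target):
        source, target = target, source
    if len(target) == 0:
        return len(source)
    return None  # placeholder: the real function would carry on here


def simplified_check(source, target):
    """Same swap, but the test is rewritten as 'if not target: return 0'."""
    if len(source) > len(target):
        source, target = target, source
    if not target:
        return 0
    return None  # placeholder: the real function would carry on here


def source_check(source, target):
    """Wikibooks-style test: checks the shorter string instead."""
    if len(source) > len(target):
        source, target = target, source
    if not source:
        return len(target)
    return None  # placeholder: the real function would carry on here


samples = ["", "a", "ab", "abc"]

# The original and simplified checks agree on every pair of sample inputs.
checks_agree = all(
    original_check(s, t) == simplified_check(s, t)
    for s, t in product(samples, repeat=2)
)

# Only the source-based check short-circuits when just one string is empty.
one_empty_original = original_check("", "abc")
one_empty_source = source_check("", "abc")
```

With these definitions `checks_agree` is true, `one_empty_original` is `None` (the check never fires), and `one_empty_source` is 3, which is why testing the shorter string is the more useful shortcut.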

The function does more work after that line; the test is merely a boundary check. With the check gone, the algorithm would still work, just less efficiently: for an empty source, the Wikibooks version would produce a series of 1-element lists ranging from [1] through to [len(target)], then return the element of that last list, so len(target).
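As a rough illustration of that behaviour, here is a two-row Levenshtein sketch in the spirit of the Wikibooks version. It is not a copy of that code: the name `word_distance` and the `early_exit` flag are assumptions added here so the boundary check can be switched off and the results compared.

```python
def word_distance(source, target, early_exit=True):
    """Levenshtein distance using two rolling rows (illustrative sketch)."""
    if len(source) > len(target):
        source, target = target, source
    # source is now the shorter (or equally long) string.

    # Boundary check: turning an empty string into target takes len(target) edits.
    if early_exit and not source:
        return len(target)

    previous_row = list(range(len(source) + 1))
    for i, t_char in enumerate(target):
        # With an empty source, each row is just the 1-element list [i + 1].
        current_row = [i + 1]
        for j, s_char in enumerate(source):
            insertion = previous_row[j + 1] + 1
            deletion = current_row[j] + 1
            substitution = previous_row[j] + (s_char != t_char)
            current_row.append(min(insertion, deletion, substitution))
        previous_row = current_row

    return previous_row[-1]


with_check = word_distance("", "abc")
without_check = word_distance("", "abc", early_exit=False)
```

Both calls give 3; without the shortcut the loop simply builds the rows [1], [2], [3] before returning the last element.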
