Calculate the difference between 2 strings (Levenshtein distance)

I am trying to calculate the distance between two strings. The distance/difference between two strings refers to the minimum number of character insertions, deletions, and substitutions required to change one string to the other.

The approach I have tried is to convert the two strings into lists, compare the lists to find the differences, and then add those differences together:

first_string = "kitten"
second_string = "sitting"

list_1 = list(first_string)
list_2 = list(second_string)

print("list_1 = ", list_1)
print("list_2 = ", list_2)
print("   ")


lengths =  len(list_2) - len(list_1)
new_list = set(list_1) - set(list_2)
print(lengths)
print(new_list)

difference = lengths + int(new_list)  # fails here: int() cannot convert a set
print(difference)

The output I get is:

list_1 =  ['k', 'i', 't', 't', 'e', 'n']
list_2 =  ['s', 'i', 't', 't', 'i', 'n', 'g']

1
{'e', 'k'}

From here I am trying to work out how to add these differences so that the result equals 3. I don't know how to get the two outputs into a compatible form so they can be added together (adding 1 and {'e', 'k'} to get a distance of 3).

You're almost there. Calculate the length of new_list using len() like you did with lengths:

difference = lengths + len(new_list)

Looks like you just need to change this line:

difference = lengths + len(new_list)  # len() already returns an int, so no int() is needed

That should give you 3 like you want :)
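
For reference, here is a minimal sketch of the question's script with that one-line fix applied. The function name rough_difference is hypothetical (it does not appear in the question), and the list() conversions are dropped because set() and len() work on strings directly.

```python
def rough_difference(first_string, second_string):
    # Length gap between the two strings (negative if the first is longer).
    lengths = len(second_string) - len(first_string)
    # Characters that occur in the first string but nowhere in the second.
    new_list = set(first_string) - set(second_string)
    # len() counts the elements of the set; int() on a set raises TypeError.
    return lengths + len(new_list)


print(rough_difference("kitten", "sitting"))  # 3
```

For "kitten" and "sitting" the length gap is 1 and the set difference is {'e', 'k'}, so the result is 1 + 2 = 3.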

This is referred to as the Levenshtein distance. Check out this implementation as further reading.
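
Note that the set-based count in the question gives 3 for this particular pair, but it does not compute the edit distance in general: for "abc" and "cba" it gives 0, while two substitutions are actually needed. Below is a minimal sketch of the usual dynamic-programming approach (often called Wagner-Fischer). It is not necessarily the implementation linked above, and the function name levenshtein is chosen here for illustration.

```python
def levenshtein(a, b):
    # previous[j] is the distance between the prefix of a processed so far
    # and the first j characters of b. Start with the empty prefix of a.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning the first i characters of a into "" takes i deletions.
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (free on a match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("abc", "cba"))         # 2
```

Only two rows of the table are kept at a time, so memory use is proportional to the length of the second string rather than to the product of both lengths.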

