
How to get the index in a list of the most similar string in Python

I want to compare a string against a list of other strings and find the most similar one. I can do that with difflib in Python, but what I actually want is the position of that match in the list.

from difflib import get_close_matches

a = ['abcde', 'efghij', 'klmno']
b = 'cdefgh'
print(get_close_matches(b, a))

That code returns ['efghij'], which is right. But what if I want to get 1 instead, because a[1] == 'efghij'?

Also, how do I get the similarity ratio? Should I compute it again with SequenceMatcher(None, b, a[1]).ratio()?

list.index gives you the position of the first occurrence:

>>> ['abcde', 'efghij', 'klmno'].index('efghij')
1
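
Putting that together with get_close_matches from the question might look like the sketch below. The helper name closest_index is made up for illustration. It assumes you only want the single best match and that None is an acceptable result when nothing clears the default cutoff of 0.6.

```python
from difflib import get_close_matches

def closest_index(query, candidates):
    """Return the index of the closest match in candidates, or None."""
    # n=1 asks for the single best match; the result is still a list
    matches = get_close_matches(query, candidates, n=1)
    if not matches:
        return None  # nothing scored at or above the cutoff
    # list.index returns the position of the first occurrence
    return candidates.index(matches[0])

a = ['abcde', 'efghij', 'klmno']
b = 'cdefgh'
print(closest_index(b, a))
```

For the lists in the question this prints 1.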

Mike's answer is correct. However, if speed matters and you need many lookups, I would advise you to build a dict:

a_hash = dict(zip(a, range(len(a))))
a_hash['efghij']  # evaluates to 1
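
One thing to watch for: if the list contains duplicates, dict(zip(...)) keeps the last index of each string, while list.index returns the first. A small sketch that keeps the first index instead is below. The name first_index and the duplicated sample list are only for illustration.

```python
# Sample list with a repeated entry, to show the difference
a = ['abcde', 'efghij', 'klmno', 'efghij']

# dict(zip(...)) overwrites earlier entries, so the last index wins
last_index = dict(zip(a, range(len(a))))

# setdefault only stores a key the first time it is seen
first_index = {}
for i, s in enumerate(a):
    first_index.setdefault(s, i)

print(last_index['efghij'], first_index['efghij'])
```

Here last_index['efghij'] is 3 and first_index['efghij'] is 1, which agrees with a.index('efghij').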

I have never used difflib, but my guess is you would do the following:

import difflib
difflib.SequenceMatcher(None, b, a[1]).ratio()
# or, going through the lookup table:
match = difflib.get_close_matches(b, a)[0]  # get_close_matches returns a list
difflib.SequenceMatcher(None, b, a[a_hash[match]]).ratio()
# both return 0.666...
# the longest common block is 'efgh' (4 characters), and ratio() is 2*M/T = 2*4/(6+6)
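
If you want the index and the ratio together, another option is to skip get_close_matches and score every candidate yourself. This is only a sketch: best_match is a hypothetical helper, it applies no cutoff (so it always returns something for a non-empty list), and it raises ValueError on an empty list.

```python
from difflib import SequenceMatcher

def best_match(query, candidates):
    """Return (index, ratio) for the candidate most similar to query."""
    # Score each candidate and remember where it sits in the list
    scored = [
        (SequenceMatcher(None, candidate, query).ratio(), i)
        for i, candidate in enumerate(candidates)
    ]
    # max() returns the first maximal item, so ties go to the lowest index
    ratio, index = max(scored, key=lambda pair: pair[0])
    return index, ratio

a = ['abcde', 'efghij', 'klmno']
b = 'cdefgh'
index, ratio = best_match(b, a)
print(index, ratio)
```

For the example above this gives index 1 with a ratio of about 0.667, so you do not have to run the comparison a second time.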

Is that what you want?
