I need to look for similar Items in a list using python. (eg 'Limits' is similar to 'Limit' or 'Download ICD file' is similar to 'Download ICD zip file') I really want my results to be similar with chars, not with digits (eg 'Angle 1' is similar to 'Angle 2'). All these strings in my list end with an '\0'
What I am trying to do is split every item at blanks and look if any part consists of a digit. But somehow it is not working as I want it to work.
Here is my code example:
for k in range(len(split)): # split already consists of splitted list entry
replace = split[k].replace(
"\\0", ""
) # replace \0 at every line ending to guarantee it is only a digit
is_num = lambda q: q.replace(
".", "", 1
).isdigit() # lambda i found somewhere on the internet
check = is_num(replace)
if check == True: # break if it is a digit and split next entry of list
break
elif check == False: # i know, else would be fine too
seq = difflib.SequenceMatcher(a=List[i].lower(), b=List[j].lower())
if seq.ratio() > 0.9:
print(Element1, "is similar to", Element2, "\t")
break
Try this, its using get_close_matches
from difflib instead of sequencematcher
.
from difflib import get_close_matches
a = ["abc/0", "efg/0", "bc/0"]
b=[]
for i in a:
x = i.rstrip("/0")
b.append(x)
for i in range(len(b)):
print(get_close_matches(b[i], (b)))
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.