Find Similar Elements in List using Python

I need to find similar items in a list using Python (e.g. 'Limits' is similar to 'Limit', or 'Download ICD file' is similar to 'Download ICD zip file'). I want the similarity to be based on letters, not on digits (e.g. 'Angle 1' is similar to 'Angle 2'). All the strings in my list end with '\0'.

What I am trying to do is split every item at the blanks and check whether any part is a number. But somehow it does not work the way I want.

Here is my code example:

import difflib

# Fragment from inside loops over List (indices i and j);
# split already holds the whitespace-split parts of the current list entry.
for k in range(len(split)):
    # remove the trailing \0 so that a numeric part consists only of digits
    replace = split[k].replace("\\0", "")
    # lambda I found somewhere on the internet: True for strings such as "12" or "1.5"
    is_num = lambda q: q.replace(".", "", 1).isdigit()
    check = is_num(replace)
    if check == True:  # break if it is a number and move on to the next list entry
        break
    elif check == False:  # I know, else would be fine too
        seq = difflib.SequenceMatcher(a=List[i].lower(), b=List[j].lower())
        if seq.ratio() > 0.9:
            print(Element1, "is similar to", Element2, "\t")
            break

Try this; it uses get_close_matches from difflib instead of SequenceMatcher.

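from difflib import get_close_matches

a = ["abc\\0", "efg\\0", "bc\\0"]  # entries end with a literal \0, as in the question
suffix = "\\0"
# drop the trailing marker; rstrip() would strip a set of characters, not a suffix
b = [s[: -len(suffix)] if s.endswith(suffix) else s for s in a]

for item in b:
    # note: every item is also reported as a close match of itself
    print(get_close_matches(item, b))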
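
The answer above does not touch the digits part of the question. One possible way to handle it is to drop numeric words before comparing, so that only the letters influence the ratio. The following is a minimal sketch, not a definitive solution. It assumes that every entry ends with a literal backslash-zero (as in the question's code) and that items differing only in a number should count as similar. The names TERMINATOR, is_number, text_key and similar_pairs are hypothetical helpers introduced here for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Assumption: entries end with a literal backslash followed by 0, as in the question.
TERMINATOR = "\\0"


def is_number(token):
    """True for tokens such as "2" or "1.5" (same check as the question's lambda)."""
    return token.replace(".", "", 1).isdigit()


def text_key(item):
    """Drop the terminator and all numeric words, then lowercase the rest."""
    if item.endswith(TERMINATOR):
        item = item[: -len(TERMINATOR)]
    words = [w for w in item.split() if not is_number(w)]
    return " ".join(words).lower()


def similar_pairs(items, threshold=0.9):
    """Return the pairs whose non-numeric text has a ratio above the threshold."""
    pairs = []
    for first, second in combinations(items, 2):
        ratio = SequenceMatcher(a=text_key(first), b=text_key(second)).ratio()
        if ratio > threshold:
            pairs.append((first, second))
    return pairs


items = [
    "Limits\\0",
    "Limit\\0",
    "Angle 1\\0",
    "Angle 2\\0",
    "Download ICD file\\0",
    "Download ICD zip file\\0",
]

# A slightly lower cutoff than the question's 0.9 is used here on purpose.
for first, second in similar_pairs(items, threshold=0.85):
    print(first, "is similar to", second)
```

The cutoff is passed as 0.85 in the usage example because the pair 'Download ICD file' / 'Download ICD zip file' appears to score slightly below the 0.9 used in the question; the right value depends on the data and is worth tuning.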
