
Inbuilt Python Function for String Comparison like N-gram

Is there any built-in function in Python that works like Ngram.Compare('text','text2') for string comparison? I don't want to install the N-gram module. I tried all the public and private methods I found by calling dir('text').

I want to get a percentage match when comparing two strings.

You want the Levenshtein distance, which is implemented by the python-Levenshtein package:

http://pypi.python.org/pypi/python-Levenshtein/

Not wanting to install anything means you have to write the code yourself:

http://en.wikipedia.org/wiki/Levenshtein_distance

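Use difflib from the standard library. Its SequenceMatcher class has a ratio() method that returns a similarity score between 0 and 1.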
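As a minimal sketch of the difflib suggestion, the snippet below turns SequenceMatcher.ratio() into a percentage. The helper name percent_match is made up for illustration and is not part of difflib. Note that ratio() measures matching subsequences, not n-grams, so the scores will not necessarily agree with what the N-gram module reports.

```python
from difflib import SequenceMatcher

def percent_match(a, b):
    """Return the similarity of two strings as a percentage (0 to 100).

    percent_match is an illustrative helper name, not a difflib function.
    """
    # ratio() is 2 * M / T, where M is the number of matched characters
    # and T is the combined length of both strings.
    return SequenceMatcher(None, a, b).ratio() * 100

print(percent_match('spa', 'spam'))   # roughly 85.7
print(percent_match('text', 'text'))  # 100.0
```

No installation is needed, since difflib ships with Python.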

You can also compute the Levenshtein distance yourself:

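def lev(seq1, seq2):
    """Return the Levenshtein (edit) distance between two sequences."""
    # Index -1 of each row holds the boundary column, i.e. the distance
    # to an empty prefix of seq2.
    thisrow = list(range(1, len(seq2) + 1)) + [0]
    for x in range(len(seq1)):
        oneago, thisrow = thisrow, [0] * len(seq2) + [x + 1]
        for y in range(len(seq2)):
            delcost = oneago[y] + 1
            addcost = thisrow[y - 1] + 1
            subcost = oneago[y - 1] + (seq1[x] != seq2[y])
            thisrow[y] = min(delcost, addcost, subcost)
    return thisrow[len(seq2) - 1]

def di(seq1, seq2):
    """Return the edit distance normalized to the range 0.0 to 1.0."""
    longest = max(len(seq1), len(seq2))
    if longest == 0:
        return 0.0  # two empty sequences are identical
    # Dividing by the longer length keeps the result at or below 1.0;
    # dividing by the shorter one can exceed 1.0 or divide by zero.
    return lev(seq1, seq2) / longest

print(lev('spa', 'spam'))
print(di('spa', 'spam'))
# A percentage match is then (1 - di(a, b)) * 100.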
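Since the question asks for something n-gram-like, here is a rough standard-library-only sketch of that idea. It is not the algorithm used by the N-gram module; it is a plain Jaccard overlap of character trigram sets without padding, and the function names ngrams and ngram_similarity are made up for illustration.

```python
def ngrams(text, n=3):
    """Return the set of character n-grams in text (no padding)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def ngram_similarity(a, b, n=3):
    """Return the Jaccard overlap of the two n-gram sets, from 0.0 to 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        # Both strings are shorter than n, so fall back to exact equality.
        return 1.0 if a == b else 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)

print(ngram_similarity('spa', 'spam'))  # 0.5
```

Multiply the result by 100 for a percentage. Because short strings produce very few trigrams, the score is coarse for inputs like these; a smaller n or padded n-grams would give finer results.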
