Hamming distance between two binary strings not working

I found an interesting algorithm to calculate the Hamming distance on this site:

def hamming2(x,y):
    """Calculate the Hamming distance between two bit strings"""
    assert len(x) == len(y)
    count,z = 0,x^y
    while z:
        count += 1
        z &= z-1 # magic!
    return count

The problem is that this algorithm only works on bit strings, and I am trying to compare two binary values that are stored as text strings, such as

'100010'
'101000'

How can I make them work with this algorithm?
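
For reference, a minimal sketch of why the function above fails on text input: Python defines the `^` operator for integers but not for `str`, so the XOR raises a `TypeError`. The helper name `xor_fails_on_strings` is hypothetical and exists only for this illustration.

```python
def xor_fails_on_strings(a, b):
    """Return True if `a ^ b` raises TypeError (hypothetical helper for illustration)."""
    try:
        a ^ b
    except TypeError:
        return True
    return False

print(xor_fails_on_strings('100010', '101000'))  # str ^ str is not defined
print(xor_fails_on_strings(0b100010, 0b101000))  # int ^ int works
```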

Implement it:

def hamming2(s1, s2):
    """Calculate the Hamming distance between two bit strings"""
    assert len(s1) == len(s2)
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

And test it:

assert hamming2("1010", "1111") == 2
assert hamming2("1111", "0000") == 4
assert hamming2("1111", "1111") == 0

If we are to stick with the original algorithm, we need to convert the strings to integers so that we can use the bitwise operators.

def hamming2(x_str, y_str):
    """Calculate the Hamming distance between two bit strings"""
    assert len(x_str) == len(y_str)
    x, y = int(x_str, 2), int(y_str, 2)  # '2' specifies we are reading a binary number
    count, z = 0, x ^ y
    while z:
        count += 1
        z &= z - 1  # magic!
    return count

Then we can call it as follows:

print(hamming2('100010', '101000'))
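
To unpack the line marked "magic!": `z & (z - 1)` clears the lowest set bit of `z`, so the loop runs once per differing bit. The sketch below traces that for the example input. `trace_clear_lowest_bit` is a hypothetical name introduced here, and `bin(...).count('1')` is shown only as an equivalent way to count the set bits.

```python
def trace_clear_lowest_bit(z):
    """Yield successive values of z as the lowest set bit is cleared each step."""
    while z:
        yield z
        z &= z - 1  # subtracting 1 flips the lowest set bit and the zeros below it

x, y = int('100010', 2), int('101000', 2)
steps = [format(z, '06b') for z in trace_clear_lowest_bit(x ^ y)]
print(steps)                   # ['001010', '001000']
print(len(steps))              # 2 differing bits
print(bin(x ^ y).count('1'))   # 2, counted from the binary representation
```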

While this algorithm is cool as a novelty, having to convert from a string to an integer first likely negates any speed advantage it might have. The answer @dlask posted is much more succinct.
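
Whether the conversion cost really outweighs the bit trick depends on input length and Python version, so it is worth measuring. A rough sketch using the standard `timeit` module follows; the function names `hamming_zip` and `hamming_bits` and the 600-character test inputs are assumptions made for this illustration, and no particular result is implied.

```python
import timeit

def hamming_zip(s1, s2):
    """Character-by-character comparison, as in the succinct answer."""
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

def hamming_bits(s1, s2):
    """Convert to integers, then count set bits of the XOR with the loop trick."""
    count, z = 0, int(s1, 2) ^ int(s2, 2)
    while z:
        count += 1
        z &= z - 1
    return count

a, b = '100010' * 100, '101000' * 100  # arbitrary 600-character inputs

for fn in (hamming_zip, hamming_bits):
    seconds = timeit.timeit(lambda: fn(a, b), number=1000)
    print(f'{fn.__name__}: {seconds:.4f} s for 1000 calls')
```

Both functions should return the same distance for the same inputs; only the timings differ from machine to machine.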

This is what I use to calculate the Hamming distance.
It counts the number of differences between equal-length strings.

def hamdist(str1, str2):
    diffs = 0
    for ch1, ch2 in zip(str1, str2):
        if ch1 != ch2:
            diffs += 1
    return diffs
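
One caveat: `zip` stops at the end of the shorter input, so `hamdist` silently ignores trailing characters when the lengths differ. A possible variant that rejects such input instead is sketched below; the name `hamdist_strict` and the choice of `ValueError` are assumptions for this illustration.

```python
def hamdist_strict(str1, str2):
    """Hamming distance that rejects inputs of different lengths."""
    if len(str1) != len(str2):
        raise ValueError('inputs must have the same length')
    # True counts as 1 when summed, so this counts mismatching positions
    return sum(ch1 != ch2 for ch1, ch2 in zip(str1, str2))

print(hamdist_strict('100010', '101000'))  # 2
```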

I think this explains the Hamming distance between two strings well:

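def hammingDist(s1, s2):
    # Encode to bytes so that indexing yields integers, which support XOR
    bytes_s1 = bytes(s1, encoding="ascii")
    bytes_s2 = bytes(s2, encoding="ascii")
    diff = 0
    # Compare only up to the length of the shorter input
    for i in range(min(len(bytes_s1), len(bytes_s2))):
        if bytes_s1[i] ^ bytes_s2[i] != 0:
            diff += 1
    return diff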
