Performing bitwise operations on large bit strings in MySQL?

I've got a MySQL database with a large number of 2048-bit binary strings (e.g. '0111001...0101'). One calculation I'll need is the Hamming distance (the total count of 1's in the XOR'd result) between each of these strings and some externally generated bitstring. To get an idea of how to write this query, I tried writing it for smaller bitstrings. Here's an example:

select BIT_COUNT(bin((b'0011100000') ^ (b'1111111111')))

The inner portion that computes the XOR works correctly, but BIT_COUNT returns strange results. This example returns 14, which is more than the number of bits in the string itself.

So I have a few questions:

First, why is BIT_COUNT returning such strange results? Is it operating on a character string rather than the binary string I'd like it to operate on? If so, how do I deal with this?

Second, notice that I'm casting (is that the right word here?) the strings as binary by prepending a b. How would I do this with column names and variables? Clearly I can't simply prepend a b to a variable name, and I can't insert a space between them. Any ideas?

Thanks,

EDIT: So here's a solution to the first problem:

select BIT_COUNT(b'0011100000' ^ b'1111111111')

There seems to be a problem when using this for larger strings (2048 bits). I tried:

select BIT_COUNT(b'001110...00011')

and it gives me results like 28, when the actual bit count should be around 1024. If I remove the b, then it appears to max out at 64. Any ideas on how to resolve this problem?
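The ceiling of 64 is consistent with MySQL's bit functions working on 64-bit integers (BIGINT) in versions before 8.0, so a 2048-bit value cannot go through BIT_COUNT in one piece there. One possible workaround, sketched here under stated assumptions rather than as a tested solution, is to cut both strings into 32-bit chunks, convert each chunk to a number with CONV, and sum the per-chunk counts. The table fingerprints, its columns id and bits, the helper table chunk_idx, and the variable @probe are all hypothetical names used for illustration. The same pattern shows one way to handle columns and variables, where the b'...' literal prefix is not available.

```sql
-- Assumption: fingerprints(id INT, bits CHAR(2048)) stores each value as '0'/'1' characters.
-- Assumption: chunk_idx(n INT) holds the integers 0..63 (64 chunks x 32 bits = 2048 bits).
-- @probe stands in for the externally generated 2048-character bit string.
SET @probe = REPEAT('01', 1024);

SELECT f.id,
       SUM(BIT_COUNT(
             -- CONV(text, 2, 10) turns a string of binary digits into its decimal value
             CONV(SUBSTRING(f.bits, c.n * 32 + 1, 32), 2, 10)
           ^ CONV(SUBSTRING(@probe, c.n * 32 + 1, 32), 2, 10)
       )) AS hamming_distance
FROM fingerprints AS f
CROSS JOIN chunk_idx AS c
GROUP BY f.id;
```

Chunks of 32 bits are used so that every converted value stays well inside the 64-bit range. On MySQL 8.0 and later, the bit functions are documented to accept binary-string arguments (BINARY, VARBINARY, BLOB) as well, so storing each value as 256 raw bytes may make the chunking unnecessary.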

Just remove the BIN function. With it, BIT_COUNT treats its argument as a character string, not as a set of bits. So

select BIT_COUNT(b'0011100000' ^ b'1111111111')

will do the job.
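To see where the 14 comes from, it may help to evaluate each stage separately. This is a minimal sketch that uses only the literals from the question; the column aliases are made up for readability.

```sql
SELECT
  b'0011100000' ^ b'1111111111'                 AS xor_value,    -- 799: the XOR as an integer
  BIN(b'0011100000' ^ b'1111111111')            AS xor_as_text,  -- '1100011111': a character string
  BIT_COUNT(BIN(b'0011100000' ^ b'1111111111')) AS wrong_count,  -- 14: the text is read as decimal 1100011111
  BIT_COUNT(b'0011100000' ^ b'1111111111')      AS hamming_dist; -- 7: set bits in the XOR itself
```

BIN returns the digits as text, and BIT_COUNT then converts that text to the decimal number 1100011111, which has 14 set bits in its binary form. That is why the count exceeds the 10-bit length of the inputs.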