
Find a repeating symmetric bit pattern in a small stream of 128 bits

How can I quickly scan groups of 128 bits for exact, equally repeating binary patterns, such as 010101... or 0011001100...?

I have a number of 128-bit blocks and want to see whether they match patterns in which the number of 1s equals the number of 0s, e.g. 010101..., 00110011... or 0000111100001111..., but NOT 001001001...

The problem is that a pattern may not start on its boundary, so the pattern 00110011... may begin as 0110011... and will then also end shifted by 1 bit. (Note that the 128 bits are not circular, so the start does not join up with the end.)

The 010101... case is easy: it is simply 0xAAAA... or 0x5555.... However, as the patterns get longer, the number of permutations grows. Currently I use repeated shifting of values, as outlined in the question "Fastest way to scan for bit pattern in a stream of bits", but something quicker would be nice, as I'm spending 70% of all CPU time in this routine. Other posters have solutions for the general case, but I am hoping the symmetric nature of my patterns might lead to something more optimal.

If it helps, I am only interested in patterns up to 63 bits long, and most interested in the power-of-2 patterns (0101..., 00110011..., 0000111100001111..., etc.). While patterns such as 5 ones/5 zeros are present, these non-power-of-2 sequences make up less than 0.1% of cases, so they can be ignored if that helps the common cases go quicker.

Other constraints for a perfect solution would be a small number of assembler instructions and no wildly random memory access (i.e. large rainbow tables are not ideal).

Edit: more precise pattern details.

I am mostly interested in the patterns 0011, 0000,1111, 0000,0000,1111,1111, 16 zeros/ones and 32 zeros/ones (commas for readability only), where each pattern repeats continuously within the 128 bits. Patterns whose repeating portion is not 2, 4, 8, 16 or 32 bits long are not as interesting and can be ignored (e.g. 000111...).

The complexity for scanning is that the pattern may start at any position, not just on a 01 or 10 transition. So, for example, all of the following would match the 4-bit repeating pattern 00001111... (commas every 4th bit for readability; an ellipsis means the pattern repeats identically):

0000,1111... or 0001,1110... or 0011,1100... or 0111,1000... or 1111,0000... or 1110,0001... or 1100,0011... or 1000,0111...

Within the 128 bits the same pattern needs to repeat; two different patterns being present is not of interest. E.g. this is NOT a valid pattern: 0000,1111,0011,0011..., as it changes from 4 bits repeating to 2 bits repeating.

I have already verified that the number of 1s is 64, which is true for all power-of-2 patterns, and now need to identify how many bits make up the repeating pattern (2, 4, 8, 16, 32) and by how much the pattern is shifted. E.g. the pattern 0000,1111 is a 4-bit pattern shifted by 0, while 0111,1000... is a 4-bit pattern shifted by 3.

Let's start with the case where the patterns do start on their boundary. Check the first bit and use it to determine your state. Then loop through the block: check the first bit, increment a count, shift left, and repeat until you find the opposite bit. This initial count is the run length. Reset the count to 1, then count the next run of opposite bits. At each switch, compare the length against the initial length and report an error if they differ. Here's a quick function; it seems to work as expected for chars, and it shouldn't be too hard to expand it to deal with 16-byte (128-bit) blocks.

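#include <stdio.h>

/* Returns the run length if the byte is made of equal-length alternating
   runs that start on a run boundary, or -1 otherwise.
   Example: check_block(0x33) (binary 00110011) returns 2. */
int check_block(unsigned char myblock)
{
  const unsigned char mask = 0x80;     /* selects the current top bit */
  int setlen = 0, count = 0;
  int ones = (myblock & mask) != 0;    /* value of the run being counted */

  for (int i = 0; i < 8; i++) {
    int bit = (myblock & mask) != 0;
    myblock = (unsigned char)(myblock << 1);
    if (bit == ones) {
      count++;
    } else {
      if (setlen == 0) setlen = count; /* the first run fixes the length */
      if (count != setlen) {
        printf("Bad block\n");
        return -1;
      }
      count = 1;
      ones = !ones;
    }
  }

  /* The final run never sees a flip, so it has to be checked here.
     setlen == 0 means there was no flip at all (all zeros or all ones). */
  if (setlen == 0 || count != setlen) {
    printf("Bad block\n");
    return -1;
  }

  printf("Good block with %d repeating bits\n", setlen);
  return setlen;
}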

Now, to deal with blocks where there's an offset, I'd suggest counting the number of bits until the first 'flip'. Store this number, then run the above routine until you hit the last segment, whose length may differ from the rest of the runs. Add the initial bit count to the last segment's length, and you should then be able to compare the sum with the length of the other runs correctly.
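The offset handling described above could be sketched roughly as follows for a full 128-bit block. This is only an illustration under some assumptions: the block is passed as two 64-bit halves `hi` and `lo`, with the most significant bit of `hi` taken as the first bit, and the names `run_result`, `bit_at` and `scan_runs` are made up for this example. It needs at least three runs to see one complete interior run, which covers run lengths from 1 to 32.

```c
#include <stdint.h>

typedef struct {
    int run;    /* length of each full run, or -1 if the block is rejected */
    int first;  /* length of the (possibly partial) leading run */
} run_result;

/* Bit i of the block, where i = 0 is the first (most significant) bit. */
static int bit_at(uint64_t hi, uint64_t lo, int i)
{
    uint64_t word = (i < 64) ? hi : lo;
    return (int)((word >> (63 - (i & 63))) & 1u);
}

run_result scan_runs(uint64_t hi, uint64_t lo)
{
    const run_result rejected = { -1, 0 };
    int runs[128];          /* lengths of consecutive runs of equal bits */
    int n = 0, count = 1;

    for (int i = 1; i < 128; i++) {
        if (bit_at(hi, lo, i) == bit_at(hi, lo, i - 1)) {
            count++;
        } else {
            runs[n++] = count;
            count = 1;
        }
    }
    runs[n++] = count;      /* the last run never sees a flip */

    if (n < 3)
        return rejected;    /* no complete interior run to measure */

    /* Every interior run must have the same length. */
    int len = runs[1];
    for (int i = 2; i < n - 1; i++)
        if (runs[i] != len)
            return rejected;

    /* The leading and trailing runs may be partial, but together they
       must add up to one or two full runs. */
    int first = runs[0], last = runs[n - 1];
    if (first > len || last > len || (first + last) % len != 0)
        return rejected;

    run_result ok = { len, first };
    return ok;
}
```

For the block 0111,1000... from the question this reports a run length of 4 with a leading run of 1 bit. The bit-by-bit loop is easy to follow but touches every bit, so it is unlikely to be the fastest option.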

This code is pretty small, and bit shifting through a buffer shouldn't require too much work on the CPU's part. I'd be interested to see how this solution ends up performing against your current one.

The generic solution for this kind of problem is to create a good hashing function for the patterns and store each pattern in a hash map. Once you have created the hash map for the patterns, look up each block of the input stream in the table. I don't have code yet, but let me know if you get stuck on the code. Please post it and I can work on it.

The restriction that the pattern repeats itself throughout the 128-bit stream limits the number of combinations, and it also gives the sequence properties that make it easy to check:

One needs to iteratively check whether the high and low parts are the same; if they are opposites, check whether that particular length contains consecutive ones.

 8-bit repeat at offset 3:  00011111 11100000 00011111 11100000
 ==> high and low 16 bits are the same
 00011111 11100000 ==> high and low parts are inverted.
 Not same, nor inverted means rejection of pattern.

At that point one needs to check whether there is a single sequence of ones: add 1 to the left side and check whether the result is a power of two; n == (n & -n) is the textbook check for that.
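A minimal sketch of this halving idea, assuming the block arrives as two 64-bit halves `hi` and `lo` and that only run lengths up to 32 matter (so the two halves must be equal). The names `halving_result` and `classify_by_halving` are invented for the example, and the shift follows the question's convention, where 0000,1111 is shift 0 and 0111,1000 is shift 3.

```c
#include <stdint.h>

typedef struct {
    int run;    /* bits per run of zeros or ones, or -1 if rejected */
    int shift;  /* shift relative to the unshifted pattern 0..01..1 */
} halving_result;

halving_result classify_by_halving(uint64_t hi, uint64_t lo)
{
    const halving_result rejected = { -1, 0 };
    if (hi != lo)
        return rejected;            /* period longer than 64 bits */

    uint64_t x = hi;
    for (int width = 64; width >= 2; width /= 2) {
        int half = width / 2;
        uint64_t mask = (1ULL << half) - 1;
        uint64_t top = x >> half, bottom = x & mask;

        if (top == bottom) {        /* same: the period is at most half */
            x = bottom;
            continue;
        }
        if (top != (~bottom & mask))
            return rejected;        /* neither same nor inverted */

        /* Inverted halves: width is the period. The leading half must be
           0..01..1 or its complement; normalise to the 0..01..1 form. */
        int starts_with_one = (int)((top >> (half - 1)) & 1);
        uint64_t g = starts_with_one ? (~top & mask) : top;

        /* g + 1 must be a power of two: n == (n & -n). */
        uint64_t n = g + 1;
        if (n != (n & (0 - n)))
            return rejected;

        int ones = 0;               /* number of trailing one bits in g */
        while (g) { ones++; g >>= 1; }

        halving_result r = { half, starts_with_one ? half + ones : ones };
        return r;
    }
    return rejected;                /* all zeros or all ones */
}
```

The loop runs at most six times and uses only shifts, masks and compares, so it needs no table lookups. For the 00011111 11100000 example above it reports a run length of 8 and, in the question's convention, a shift of 5.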

I've thought about making a state machine, so that each successive byte (out of 16) would advance its state, and after some 16 state transitions you'd have the pattern identified. But that doesn't look very promising: the data structures and logic look more complex.

Instead, why not precompute all 126 patterns (from 01 up to 32 zeros + 32 ones), sort them and perform a binary search? That would give you at most 7 iterations of binary search. And you don't need to store all 16 bytes of every pattern, as its two halves are identical. That gives you 126*16/2 = 1008 bytes for the array of patterns. You also need something like 2 bytes per pattern to store the length of the zero (one) runs and the shift relative to whatever pattern you consider unshifted. That's a total of 126*(16/2+2) = 1260 bytes of data (which should be gentle on the data cache) and a very simple and tiny binary search algorithm. Basically, it's just an improvement over the answer that you mentioned in the question.
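As a rough sketch of that table, assuming the block is given as two 64-bit halves `hi` and `lo` and using the standard `qsort` and `bsearch` in place of a hand-written search. The names `pattern_entry`, `build_table` and `lookup_pattern` are hypothetical, and for simplicity the struct is padded by the compiler, so it takes more than the 10 bytes per entry estimated above.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t half;   /* one 64-bit half of the block; both halves are equal */
    uint8_t  run;    /* length of each run of zeros or ones */
    uint8_t  shift;  /* shift relative to the unshifted pattern 0..01..1 */
} pattern_entry;

static pattern_entry table[126];   /* 2 + 4 + 8 + 16 + 32 + 64 entries */
static int table_ready = 0;

static int cmp_entry(const void *a, const void *b)
{
    uint64_t x = ((const pattern_entry *)a)->half;
    uint64_t y = ((const pattern_entry *)b)->half;
    return (x > y) - (x < y);
}

static void build_table(void)
{
    int n = 0;
    for (int run = 1; run <= 32; run *= 2) {
        for (int shift = 0; shift < 2 * run; shift++) {
            uint64_t w = 0;
            /* Bit i (i = 0 is the first, most significant bit) is 1 when
               position i + shift falls in the second half of a period. */
            for (int i = 0; i < 64; i++) {
                int bit = ((i + shift) % (2 * run)) >= run;
                w = (w << 1) | (uint64_t)bit;
            }
            table[n].half = w;
            table[n].run = (uint8_t)run;
            table[n].shift = (uint8_t)shift;
            n++;
        }
    }
    qsort(table, (size_t)n, sizeof table[0], cmp_entry);
    table_ready = 1;
}

/* Returns the matching entry, or NULL if the block is not one of the
   126 precomputed patterns. */
const pattern_entry *lookup_pattern(uint64_t hi, uint64_t lo)
{
    if (hi != lo)
        return NULL;
    if (!table_ready)
        build_table();
    pattern_entry key = { hi, 0, 0 };
    return bsearch(&key, table, 126, sizeof table[0], cmp_entry);
}
```

In a hot loop the table would normally be built once up front rather than checked on every call, and an inlined binary search would avoid the function-pointer call that `bsearch` makes for each comparison.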

You might want to try switching to linear search after 4-5 iterations of binary search. That may give a small boost to the overall algorithm.

Ultimately, the winner is determined by testing and profiling. And that's what you should do: get a few implementations and compare them on the real data in the real system.
