What is the fastest way to store and access single bits in an N-dimensional array of bits in C++?

I have code that needs to read and write single bits in a large array of bits (several megabytes in total) in random order, something like playing Battleship on an N-dimensional array.

I suspect that a compact array would be faster, because more of it would stay in cache. On the other hand, I understand that accessing an element of an array declared as an array object is equivalent to access through a compile-time pointer value, whereas element access in a typical std::vector implementation goes through a run-time pointer value, which is slower. And I have no idea how bitsets and bit-fields fit into all this.

I don't need this code to be portable, just very fast (x86).

There's no single answer to that, as it depends on the processor architecture (and the compiler).

That said, a bit array is quite fast. You simply create it as an array of ints and then access a bit by selecting the correct int and extracting the correct bit from it. It will be both compact and fast as long as your int has a power-of-two number of bits (32, 64, etc.). Otherwise you might have to make a trade-off between compactness and speed (for example, on a 36-bit processor you could choose speed and use only 32 bits per int).

The code in the compact case becomes (p[idx / BITS_PER_INT] >> (idx % BITS_PER_INT)) & 1. For the fast case where BITS_PER_INT = 1 << SHIFT this is the same as (p[idx >> SHIFT] >> (idx & (BITS_PER_INT-1))) & 1.
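The shift-and-mask access can be sketched as a small helper type. This is only an illustration under assumptions: the name BitArray, the choice of 64-bit words (so SHIFT is 6) and the set/clear helpers are hypothetical and do not come from the answer itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a flat bit array packed into 64-bit words.
struct BitArray {
    static constexpr std::size_t SHIFT = 6;                           // log2(64)
    static constexpr std::size_t BITS_PER_WORD = std::size_t{1} << SHIFT;
    static constexpr std::size_t MASK = BITS_PER_WORD - 1;

    std::vector<std::uint64_t> words;  // value-initialised to all zero bits

    // Round the bit count up to a whole number of words.
    explicit BitArray(std::size_t nbits) : words((nbits + MASK) >> SHIFT) {}

    // Select the word with idx >> SHIFT, then the bit with idx & MASK.
    bool get(std::size_t idx) const {
        return (words[idx >> SHIFT] >> (idx & MASK)) & 1u;
    }
    void set(std::size_t idx) {
        words[idx >> SHIFT] |= std::uint64_t{1} << (idx & MASK);
    }
    void clear(std::size_t idx) {
        words[idx >> SHIFT] &= ~(std::uint64_t{1} << (idx & MASK));
    }
};
```

No bounds checking is done here, in keeping with the speed-over-safety goal of the question; whether 64-bit or 32-bit words are faster on a given machine would need to be measured.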

If you need more control over how the data is stored, you could customize the layout to suit your requirements (although if portability is not an issue, this probably isn't one either).

Again, since what is fastest is implementation-specific, I should probably mention std::vector<bool>. Although it is not guaranteed to be as fast or as compact as possible, it is quite likely to be at least one of the two, and probably a good trade-off between them if required.
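For the N-dimensional case, one possible way to use std::vector<bool> is to flatten the coordinates into a single row-major index. The sketch below assumes three dimensions; the class name BitGrid3 and its members are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 3-D grid of bits stored in one flat std::vector<bool>.
class BitGrid3 {
public:
    BitGrid3(std::size_t nx, std::size_t ny, std::size_t nz)
        : ny_(ny), nz_(nz), bits_(nx * ny * nz, false) {}

    bool get(std::size_t x, std::size_t y, std::size_t z) const {
        return bits_[index(x, y, z)];
    }
    void set(std::size_t x, std::size_t y, std::size_t z, bool value = true) {
        bits_[index(x, y, z)] = value;
    }

private:
    // Row-major flattening: z varies fastest, x slowest.
    std::size_t index(std::size_t x, std::size_t y, std::size_t z) const {
        return (x * ny_ + y) * nz_ + z;
    }

    std::size_t ny_, nz_;
    std::vector<bool> bits_;
};
```

Typical implementations pack std::vector<bool> one bit per element, but the standard does not require it, so the actual memory footprint and speed should be checked on the target compiler.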

After looking into how element access translates to assembler code, I went ahead and implemented my own bit-addressing method. I used

unsigned char array[n][n]...[n/8];

and created a lookup table

unsigned char lookup[8]={1,2,4,8,16,32,64,128};

and split the last array index in two: the 3 least significant bits (z&7) select an entry in the lookup table, and the remaining bits (z>>3) select the byte. For writing a bit I used binary OR (|); for reading I used binary AND (&) to mask the bit and then shifted the result right by the same index I used to address the lookup table.

the read part:

bool result=(bool)((array[x][y]...[z>>3]&lookup[z&7])>>(z&7));

the write part:

array[x][y]...[z>>3] |= lookup[z&7]; //writes 1

I am quite pleased with the performance, and this should compile to near-minimal assembler code, but I don't have any solid proof.
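A self-contained sketch of this lookup-table scheme for a fixed three-dimensional case might look like the following. The dimension N = 64, the array name grid and the function names are assumptions made for illustration. Because one byte holds 8 bits, the byte index is z >> 3 and the bit index is z & 7.

```cpp
#include <cstddef>

constexpr std::size_t N = 64;  // assumed size per dimension (a multiple of 8)

// Static storage is zero-initialised; the last dimension is packed 8 bits per byte.
static unsigned char grid[N][N][N / 8];

// One mask per bit position within a byte.
static const unsigned char lookup[8] = {1, 2, 4, 8, 16, 32, 64, 128};

inline bool read_bit(std::size_t x, std::size_t y, std::size_t z) {
    // Mask out the wanted bit and test whether it is set.
    return (grid[x][y][z >> 3] & lookup[z & 7]) != 0;
}

inline void set_bit(std::size_t x, std::size_t y, std::size_t z) {
    grid[x][y][z >> 3] |= lookup[z & 7];  // writes 1
}

inline void clear_bit(std::size_t x, std::size_t y, std::size_t z) {
    grid[x][y][z >> 3] &= static_cast<unsigned char>(~lookup[z & 7]);  // writes 0
}
```

Whether the table lookup beats computing 1 << (z & 7) directly depends on the compiler and CPU, so it is worth comparing the generated assembly or timing both variants.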
