How to bitwise operate on memory block (C++)
Is there a better (faster/more efficient) way to perform a bitwise operation on a large memory block than using a for loop? After looking into the options, I noticed that std has a member, std::bitset, and was also wondering whether it would be better (or even possible) to convert a large region of memory into a bitset without changing its values, perform the operations, and then switch its type back to normal.
Edit / update: I think a union might apply here, such that the memory block is allocated with new as an array of int or something similar and then manipulated as one large bitset. Operations seem to be able to be done over the entire set, based on what is said here: http://www.cplusplus.com/reference/bitset/bitset/operators/ .
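For reference, here is a minimal sketch of what the whole-set operators on std::bitset look like. The helper names flip_via_bitset and xor_via_bitset are hypothetical and only serve the illustration. Note that the size of a std::bitset is a compile-time template parameter and the bits live inside the bitset object, so a value has to be copied in and out rather than viewed in place.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper: flip every bit of a 64-bit word using std::bitset.
std::uint64_t flip_via_bitset(std::uint64_t word) {
    std::bitset<64> bits(word);  // copies the value into the bitset
    bits.flip();                 // operates on the whole set at once
    return bits.to_ullong();     // copies the result back out
}

// Hypothetical helper: XOR two 64-bit words using the bitset operator^.
std::uint64_t xor_via_bitset(std::uint64_t a, std::uint64_t b) {
    return (std::bitset<64>(a) ^ std::bitset<64>(b)).to_ullong();
}
```

Because of the fixed size and the copies, this is probably more useful for small, fixed-width flags than for a large block allocated at run time.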
In general, there is no magical way faster than a for loop. However, you can make it easier for the compiler to optimize the loop by keeping a few things in mind:
C99 example of XORing memory with a constant byte, assuming long long is 128-bit, the start of the buffer is aligned to 16 bytes, and without considering point 3. On most current platforms long long is 64 bits, so there the constant would have to be shortened to 16 hex digits. Bitwise operations on two memory buffers are very similar.
size_t len = ...;
char *buffer = ...;
size_t const loads_per_i = 4;
size_t iters = len / sizeof(long long) / loads_per_i;
long long *ptr = (long long *) buffer;
long long xorvalue = 0x5e5e5e5e5e5e5e5e5e5e5e5e5e5e5e5eLL;
// run in multiple threads if there are more than 4 MB to xor
#pragma omp parallel for if(iters > 65536)
for (size_t i = 0; i < iters; ++i) {
    size_t j = loads_per_i*i;
    ptr[j  ] ^= xorvalue;
    ptr[j+1] ^= xorvalue;
    ptr[j+2] ^= xorvalue;
    ptr[j+3] ^= xorvalue;
}
// finish long longs which don't align to 4
for (size_t i = iters * loads_per_i; i < len / sizeof(long long); ++i) {
    ptr[i] ^= xorvalue;
}
// finish bytes which don't align to long long
for (size_t i = (len / sizeof(long long)) * sizeof(long long); i < len; ++i) {
    buffer[i] ^= xorvalue;
}