Run-time bit copying (bitmasks) in C++

Question

I have a problem at hand and solved it one way, but I am not happy with my solution because it doesn't work in every context. The solution has to be in C++(11).

I have a char array and an int. Given a bit offset relative to data and a length (in bits), I want to extract the bits from offset to offset+length from the array and store them in out.

char8_t data[8];
int32_t out;
int32_t offset;
int32_t length;

[Figure: example with offset=24; length=4]

Both the offset and the length are only available at run-time. Hence, I would like to avoid creating bitmasks. I personally solved it by casting the complete array to int64_t, then right-shifting by (64-offset-length) and left-shifting by (64-length).

out = (*(int64_t*)data) >> (64-offset-length) << (64-length);

The issue: if my array were longer, there would be no primitive type able to hold the complete array, and my solution would no longer work. Is there a better (scalable) way to do this?

In an ideal world I could create a pointer with a bit offset, but this is C++, not an ideal world.

Alternatives I thought about: adding up bits with += on "out" by iterating through the array and left-shifting. Quite inelegant!

I am aware that there are similar questions out there, but either they have been poorly answered or the answers have hefty performance implications.

Answer 1

First, your approach will depend on endianness, i.e. on whether the system stores the most significant byte at the beginning or at the end of the respective 8-byte memory block.
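One way to remove that dependence, sketched here under the assumption that bits are numbered from the most significant bit of data[0] (as the shift expression in the question suggests), is to assemble the 64-bit word byte by byte instead of casting the pointer. The names load_be64 and extract_msb_first are hypothetical and not part of the question.

```cpp
#include <cstdint>

// Hypothetical helper: reads 8 bytes as a big-endian 64-bit value,
// independent of the host's byte order.
inline std::uint64_t load_be64(const unsigned char* p) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) {
        v = (v << 8) | p[i];
    }
    return v;
}

// Hypothetical helper: returns `length` bits starting at bit `offset`,
// counting bit 0 as the most significant bit of p[0].
// Assumes 1 <= length <= 32 and offset + length <= 64.
inline std::uint32_t extract_msb_first(const unsigned char* p,
                                       unsigned offset, unsigned length) {
    const std::uint64_t word = load_be64(p);
    const std::uint64_t mask = (std::uint64_t(1) << length) - 1;
    return static_cast<std::uint32_t>((word >> (64 - offset - length)) & mask);
}
```

With the bytes 0x12 0x34 0x56 0x78 0x9A 0xBC 0xDE 0xF0, a call with offset=24 and length=4 yields the high nibble of the fourth byte, 0x7. This still only covers 8 bytes; it fixes the byte-order issue, not the scaling issue.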

Second, I'd use unsigned data types, e.g. uint8_t data[8] and uint32_t, in order to deal correctly with bit shifts and (automatic) type promotions.

If you know exactly where in your data[8] a specific piece of information is stored, and in which byte order, you could write it as follows:

uint32_t out = data[0] + 256*data[1]; 
...

This ties your "decoder" to the order and meaning of the original data, allows your data to be longer than the largest integral data type, and avoids the undefined behaviour that shifting signed integral values into the sign bit might introduce.
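The same byte-wise assembly can be written as a loop. This is only a sketch: it assumes data[0] is the least significant byte (matching data[0] + 256*data[1]), and load_le is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: combines `count` bytes (count <= 4) into one value,
// treating data[0] as the least significant byte.
inline std::uint32_t load_le(const std::uint8_t* data, std::size_t count) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Cast before shifting so the shift happens on an unsigned 32-bit value.
        value |= static_cast<std::uint32_t>(data[i]) << (8 * i);
    }
    return value;
}
```

For the bytes 0x34 0x12, load_le(data, 2) gives 0x1234, the same result as data[0] + 256*data[1].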

If your offset is not a multiple of 8, i.e. the "value" does not start at the beginning of a byte, you can still use bit shift operations to correct this. Let's assume that the value starts at an offset of 2 bits; then you could write:

uint32_t out = (data[0] >> 2) + (data[1] << 6) + (data[2] << (6+8));
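A possible generalization to a run-time offset and length, given as a sketch rather than a definitive implementation: it keeps the convention of the line above (bit 0 is the least significant bit of data[0]) and only touches the bytes that contain requested bits, so the array may be arbitrarily long. The name extract_lsb_first and the limit of 32 result bits are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: extracts `length` bits (1..32) starting at bit `offset`,
// where bit 0 is the least significant bit of data[0].
// The caller must ensure offset + length <= 8 * (number of bytes in data).
inline std::uint32_t extract_lsb_first(const std::uint8_t* data,
                                       std::size_t offset, std::size_t length) {
    const std::size_t first = offset / 8;                // first byte touched
    const std::size_t last = (offset + length - 1) / 8;  // last byte touched
    std::uint64_t acc = 0;                               // at most 5 bytes = 40 bits
    for (std::size_t i = first; i <= last; ++i) {
        acc |= static_cast<std::uint64_t>(data[i]) << (8 * (i - first));
    }
    acc >>= offset % 8;                                  // drop the bits before `offset`
    const std::uint64_t mask = (std::uint64_t(1) << length) - 1;
    return static_cast<std::uint32_t>(acc & mask);
}
```

For the bytes 0xA5 0x3C 0xFF 0x00, offset=2 and length=16 give 0xCF29, which equals the expression above truncated to 16 bits. A mask is still built at run-time here; it costs one shift and one subtraction.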

But in the end your target will be limited to a specific number of bits, since the language implementation on your platform fixes a particular size for each of the primitive data types; unsigned long long is probably still 64 bits. This limit is implementation-defined; the standard only guarantees a minimum number of bits for each data type. Whether the limit comes from registers or something else, you cannot know: it is implementation-defined.

Answer 2

Have you tried std::vector<bool>? It is a specialization of std::vector that combines the dynamic size of std::vector with the compactness of std::bitset.
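A minimal sketch of how that could look, assuming bits are numbered from the most significant bit of each byte. The helper names to_bits and read_bits are hypothetical. Note that this handles one bit per step, so it is likely slower than shifting whole bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: unpacks bytes into one bool per bit, MSB of each byte first.
inline std::vector<bool> to_bits(const unsigned char* data, std::size_t size) {
    std::vector<bool> bits;
    bits.reserve(size * 8);
    for (std::size_t i = 0; i < size; ++i) {
        for (int b = 7; b >= 0; --b) {
            bits.push_back(((data[i] >> b) & 1) != 0);
        }
    }
    return bits;
}

// Hypothetical helper: reads `length` bits (at most 32) starting at `offset`.
// The caller must ensure offset + length <= bits.size().
inline std::uint32_t read_bits(const std::vector<bool>& bits,
                               std::size_t offset, std::size_t length) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < length; ++i) {
        value = (value << 1) | (bits[offset + i] ? 1u : 0u);
    }
    return value;
}
```

For the bytes 0x12 0x34 0x56 0x78, read_bits(to_bits(data, 4), 24, 4) returns 0x7.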

Answer 3

I'd use a bitset as a temporary. First copy the relevant bytes in a loop (byte alignment), then shift and mask to perform the bit alignment.

unsigned startbit = offset;
unsigned startbyte = startbit / 8;
unsigned endbit = offset + length - 1;
unsigned endbyte = endbit / 8;

bitset<8*(sizeof(out) + 1)> align(0);
// Byte-aligned copy. Counting down from endbyte + 1 avoids unsigned
// wrap-around (and an endless loop) when startbyte == 0.
for(unsigned byte = endbyte + 1; byte-- > startbyte; ) {
// for(unsigned byte = startbyte; byte <= endbyte; ++byte) { // check endianness
    align <<= 8;
    align |= static_cast<unsigned char>(data[byte]); // avoid sign extension of char
}
align >>= startbit % 8; // bit align
align &= ((1ULL << length) - 1); // mask; 1ULL keeps the shift defined for length >= 31

out = align.to_ullong();

Answer 4

I use std::bitset and boost::dynamic_bitset to represent binary data and manipulate it. std::bitset works well if the length is fixed; otherwise boost::dynamic_bitset is a good choice. With this, you can extract bits with overloaded bit operators:

#include <boost/dynamic_bitset.hpp>

using boost::dynamic_bitset;

dynamic_bitset<unsigned char> extract(unsigned char* first, unsigned char* last, int offset, int length) {
   dynamic_bitset<unsigned char> bits(first, last);

   bits >>= bits.size() - (offset  + length);
   bits.resize(length);

   return bits;
}

So instead of int32_t out; you can use dynamic_bitset<> to hold values of arbitrary bit length in an efficient way.

4 answers

Answer 1
1 (accepted) 2018-04-10 09:44:02

Answer 2
1 2018-04-10 10:01:26

Answer 3
1 2018-04-10 11:16:25

Answer 4
0 2018-04-10 12:06:05
