Given an array of uint8_t, what is a good way to extract any subsequence of bits as a uint32_t?

Question

I have run into an interesting problem lately:

Let's say I have an array of bytes (uint8_t, to be exact) of length at least one. I need a function that takes a subsequence of bits from this array, starting at bit X (zero-based index, inclusive) and having length L, and returns it as a uint32_t. If L is smaller than 32, the remaining high bits should be zero.

Although this is not very hard to solve, my current ideas on how to do it seem a bit cumbersome to me. I'm thinking of a table of all the possible masks for a given byte (start at bit 0-7, take 1-8 bits) and then constructing the number one byte at a time using this table.

Can somebody come up with a nicer solution? Note that I cannot use Boost or STL for this - and no, it is not homework; it's a problem I ran into at work, and we do not use Boost or STL in the code where this goes. You can assume that 0 < L <= 32 and that the byte array is large enough to hold the subsequence.

One example of correct input/output:

array: 00110011 1010 1010 11110011 01 101100
subsequence: X = 12 (zero based index), L = 14
resulting uint32_t = 00000000 00000000 00 101011 11001101
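
For illustration, the table-driven idea described above might look roughly like the sketch below. The names mask_table, init_mask_table and extract_by_table are hypothetical, and the sketch assumes bit 0 is the most significant bit of the first byte, as in the example.

```c
#include <assert.h>
#include <stdint.h>

/* mask_table[start][count] selects `count` bits of a byte, beginning at
   bit `start` counted from the MSB (hypothetical helper table). */
static uint8_t mask_table[8][9];
static int mask_table_ready = 0;

static void init_mask_table(void)
{
    for (unsigned start = 0; start < 8; start++)
        for (unsigned count = 1; start + count <= 8; count++)
            mask_table[start][count] =
                (uint8_t)((0xFFu >> start) & (0xFFu << (8 - start - count)));
    mask_table_ready = 1;
}

/* Build the result one byte at a time, taking as many bits as the
   current byte can supply (at most 8, at most what is still needed). */
uint32_t extract_by_table(const uint8_t *bytes, unsigned X, unsigned L)
{
    uint32_t result = 0;

    assert(L > 0 && L <= 32);
    if (!mask_table_ready)
        init_mask_table();

    while (L > 0) {
        unsigned start = X % 8;           /* first wanted bit, from MSB */
        unsigned count = 8 - start;       /* bits available in this byte */
        unsigned shift;
        uint8_t piece;

        if (count > L)
            count = L;
        shift = 8 - start - count;        /* unwanted low bits of this byte */
        piece = (uint8_t)((bytes[X / 8] & mask_table[start][count]) >> shift);

        result = (result << count) | piece;
        X += count;
        L -= count;
    }
    return result;
}
```

With the example above (bytes 0x33, 0xAA, 0xF3, 0x6C, X = 12, L = 14) this should return 0x2BCD.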

Answer 1

Only the first and last bytes in the subsequence will involve some bit slicing to get the required bits out, while the intermediate bytes can be shifted in whole into the result. Here's some sample code, absolutely untested -- it does what I described, but some of the bit indices could be off by one:

#include <stdint.h>

/* Extract L bits (0 < L <= 32) starting at bit X; bit 0 is the MSB of bytes[0] */
uint32_t extractBits(const uint8_t *bytes, int X, int L)
{
  uint32_t result;

  int startByte  = X / 8,               /* starting byte number */
      startBit   = 7 - X % 8,           /* first bit within starting byte, from LSB */
      endByte    = (X + L - 1) / 8,     /* ending byte number */
      endBit     = 7 - (X + L - 1) % 8; /* last bit within ending byte, from LSB */

  /* Special case where start and end are within same byte:
     just get bits from startBit down to endBit, inclusive */
  if (startByte == endByte) {
    uint8_t byte = bytes[startByte];
    result = (byte >> endBit) & ((1u << (startBit - endBit + 1)) - 1);
  }
  /* All other cases: get ending bits of starting byte,
                      all other bytes in between,
                      starting bits of ending byte */
  else {
    uint8_t byte = bytes[startByte];
    result = byte & ((1u << (startBit + 1)) - 1);

    for (int i = startByte + 1; i < endByte; i++)
      result = (result << 8) | bytes[i];

    byte = bytes[endByte];
    result = (result << (8 - endBit)) | (byte >> endBit);
  }

  return result;
}
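
Since the code above is untested, one way to check the index arithmetic is to compare it against a deliberately slow reference that copies one bit at a time. A minimal sketch, assuming bit 0 is the MSB of the first byte as in the question's example; the name extract_bits_ref is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Slow reference: append the requested bits to the result one by one. */
uint32_t extract_bits_ref(const uint8_t *bytes, unsigned X, unsigned L)
{
    uint32_t result = 0;

    assert(L > 0 && L <= 32);
    for (unsigned i = 0; i < L; i++) {
        unsigned pos = X + i;                                  /* absolute bit index */
        unsigned bit = (bytes[pos / 8] >> (7 - pos % 8)) & 1u; /* MSB-first within a byte */
        result = (result << 1) | bit;
    }
    return result;
}
```

For the example in the question (X = 12, L = 14) this should give 0x2BCD, so any faster version can be compared against it over all valid X and L for a small test array.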

Answer 2

Take a look at std::bitset and boost::dynamic_bitset.

Answer 3

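I would be thinking of something like loading the bytes into a uint64_t and then shifting left and right to lose the uninteresting bits.

#include <stdint.h>

uint32_t extract_bits(const uint8_t* bytes, int start, int count)
{
    /* Assemble the bytes MSB-first rather than casting to uint64_t*:
       the cast depends on host byte order and alignment, and always
       reads 8 bytes even when the array is shorter. */
    const uint8_t *p = bytes + start / 8;
    int nbytes = (start % 8 + count + 7) / 8;   /* 1..5 bytes are touched */
    uint64_t hold = 0;
    for (int i = 0; i < nbytes; i++)
        hold |= (uint64_t)p[i] << (56 - 8 * i);
    hold <<= start % 8;    /* drop the bits before the subsequence */
    hold >>= 64 - count;   /* drop the bits after it and right-align */
    return (uint32_t)hold;
}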

Answer 4

For the sake of completeness, I'm adding my solution, inspired by the comments and answers here. Thanks to all who bothered to think about the problem.

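// firstByteMasks[n] clears the n leading (most significant) bits of the first byte
static const uint8_t firstByteMasks[8] = { 0xFF, 0x7F, 0x3F, 0x1F, 0x0F, 0x07, 0x03, 0x01 };

// buf: source bytes; bitoff: zero-based index of the first bit (bit 0 is the MSB of buf[0]);
// len: size of buf in bytes; bitcount: number of bits to extract (1..32)
uint32_t getBits( const uint8_t *buf, const uint32_t bitoff, const uint32_t len, const uint32_t bitcount )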
{
    uint64_t result = 0;

    int32_t startByte = bitoff / 8; // starting byte number
    int32_t endByte = ((bitoff + bitcount) - 1) / 8; // ending byte number
    int32_t rightShift = 16 - ((bitoff + bitcount) % 8 );

    if ( endByte >= len ) return -1;

    if ( rightShift == 16 ) rightShift = 8; 

    result = buf[startByte] & firstByteMasks[bitoff % 8];
    result = result << 8;

    for ( int32_t i = startByte + 1; i <= endByte; i++ )
    {
        result |= buf[i];
        result = result << 8;
    }
    result = result >> rightShift;
    return (uint32_t)result;
}

A few notes: I tested the code and it seems to work just fine; however, there may be bugs. If I find any, I will update the code here. Also, there are probably better solutions!
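
One caveat about the range check: returning -1 from a function typed uint32_t yields 0xFFFFFFFF, which is also a valid result when all 32 requested bits are set. A minimal sketch that reports the error separately, assuming a hypothetical get_bits_checked with an out-parameter (neither the name nor the signature comes from the code above), and assuming bitoff + bitcount does not overflow:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 and stores the bits in *out on success; returns 0 if the
   request is invalid or would run past the end of the buffer. */
int get_bits_checked(const uint8_t *buf, uint32_t len,
                     uint32_t bitoff, uint32_t bitcount, uint32_t *out)
{
    uint64_t acc = 0;
    uint32_t first, last, i;

    if (bitcount == 0 || bitcount > 32)
        return 0;

    first = bitoff / 8;                      /* byte holding the first bit */
    last  = (bitoff + bitcount - 1) / 8;     /* byte holding the last bit */
    if (last >= len)
        return 0;

    /* At most 5 bytes are touched, so 64 bits are enough. */
    for (i = first; i <= last; i++)
        acc = (acc << 8) | buf[i];

    acc >>= 7 - (bitoff + bitcount - 1) % 8; /* drop bits after the range */
    acc &= ((uint64_t)1 << bitcount) - 1;    /* drop bits before the range */

    *out = (uint32_t)acc;
    return 1;
}
```

A caller would then test the return value before using *out, so an all-ones result is no longer ambiguous.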

Answer 1
4, accepted, 2010-12-14 16:08:21

Answer 2
1, 2010-12-14 15:48:07

Answer 3
1, 2010-12-14 15:53:19

Answer 4
1, 2010-12-15 10:22:58
