Given a uint8_t array, what is a good way to extract an arbitrary subsequence of bits as a uint32_t?

Question

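I recently ran into an interesting problem:

Say I have an array of bytes (uint8_t, to be exact) of length at least one. I need a function that takes a subsequence of bits from this array, starting at bit X (zero-based index, inclusive) and of length L, and returns it as a uint32_t. If L is smaller than 32, the remaining high bits should be zero.

Although this is not very hard to solve, my current ideas on how to do it seem a bit cumbersome to me. I am thinking of a table of all possible masks for a given byte (starting at bit 0-7, taking 1-8 bits) and then building the number one byte at a time using that table.

Can somebody come up with a nicer solution? Note that I cannot use Boost or STL for this. And no, it is not homework; it is a problem I ran into at work, and we do not use Boost or STL in the code where this would go. You may assume that 0 < L <= 32 and that the byte array is large enough to hold the subsequence.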

One example of correct input/output:

Array: 00110011 1010 1010 11110011 01 101100
Subsequence: X = 12 (zero-based index), L = 14
Resulting uint32_t = 00000000 00000000 00 101011 11001101
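
To pin down the bit numbering used in the example, here is a minimal bit-by-bit sketch. It is deliberately slow and only meant as a reference to check faster versions against. The name extract_bits_ref and its signature are assumptions made for illustration and do not come from the post. It assumes bit 0 is the most significant bit of the first byte, which is what the example implies.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical reference helper (name and signature are assumptions).
// Bits are numbered MSB-first: bit 0 is the top bit of bytes[0].
uint32_t extract_bits_ref(const uint8_t *bytes, unsigned start, unsigned count)
{
    assert(count > 0 && count <= 32);
    uint32_t result = 0;
    for (unsigned i = start; i < start + count; ++i) {
        // Pick bit i out of its byte, counting from the most significant bit.
        unsigned bit = (bytes[i / 8] >> (7 - i % 8)) & 1u;
        result = (result << 1) | bit;  // append it at the low end
    }
    return result;
}
```

With the array from the example stored as { 0x33, 0xAA, 0xF3, 0x6C }, calling extract_bits_ref(data, 12, 14) gives 0x2BCD, which is 101011 11001101 in binary.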

Answer 1

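Only the first and last bytes of the subsequence need some bit slicing to get the required bits out; the bytes in between can be shifted into the result whole. Here is some untested sample code that does what I described, although some of the bit indices may be off by one: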

#include <stdint.h>

uint32_t extractBits(const uint8_t bytes[], int X, int L)
{
  uint32_t result;

  int startByte = X / 8,               /* byte holding the first bit */
      startBits = 8 - X % 8,           /* bits from the first bit to the end of that byte */
      endByte   = (X + L - 1) / 8,     /* byte holding the last bit */
      endShift  = 7 - (X + L - 1) % 8; /* unused low bits below the last bit */

  /* Special case where start and end are within the same byte:
     shift the wanted bits down and keep the low L of them (L <= 8 here) */
  if (startByte == endByte) {
    uint8_t byte = bytes[startByte];
    result = (byte >> endShift) & ((1u << L) - 1);
  }
  /* All other cases: get ending bits of starting byte,
                      all other bytes in between,
                      starting bits of ending byte */
  else {
    result = bytes[startByte] & ((1u << startBits) - 1);

    for (int i = startByte + 1; i < endByte; i++)
      result = (result << 8) | bytes[i];

    result = (result << (8 - endShift)) | (bytes[endByte] >> endShift);
  }

  return result;
}

Answer 2

Take a look at std::bitset and boost::dynamic_bitset.
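
As a rough illustration of this suggestion, the sketch below uses std::bitset only. The helper name extract_bits_bitset, its signature, and the requirement that the array size N is known at compile time are all assumptions, not something the answer specifies. It copies the whole array on every call, so it is an illustration rather than an efficient solution, and it does not help the asker, who cannot use the STL.

```cpp
#include <assert.h>
#include <stdint.h>
#include <bitset>
#include <cstddef>

// Hypothetical helper; name, signature and the fixed size N are assumptions.
template <std::size_t N>
uint32_t extract_bits_bitset(const uint8_t (&bytes)[N], std::size_t start, std::size_t count)
{
    assert(count > 0 && count <= 32 && start + count <= N * 8);
    std::bitset<N * 8> all;
    // Load MSB-first: bit 0 of the stream ends up at the highest bitset position.
    for (std::size_t i = 0; i < N; ++i) {
        all <<= 8;
        all |= std::bitset<N * 8>(bytes[i]);
    }
    // Move the wanted bits to the bottom and mask off everything above them.
    all >>= (N * 8 - start - count);
    std::bitset<N * 8> mask((1ULL << count) - 1);
    return static_cast<uint32_t>((all & mask).to_ullong());
}
```

Because std::bitset has a compile-time size, a buffer whose length is only known at run time would need boost::dynamic_bitset or a hand-written loop instead.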

Answer 3

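I would think about something like loading the relevant bytes into a uint64_t and then shifting left and right to lose the uninteresting bits.

#include <stdint.h>

uint32_t extract_bits(const uint8_t* bytes, int start, int count)
{
    const uint8_t *p = bytes + start / 8;      /* first byte that holds wanted bits */
    int nbytes = (start % 8 + count + 7) / 8;  /* bytes touched: 1 to 5 */
    uint64_t hold = 0;

    /* Load MSB-first into the top of hold. A plain uint64_t* cast would depend on
       host byte order and alignment, and could read past the end of the array. */
    for (int i = 0; i < nbytes; i++)
        hold |= (uint64_t)p[i] << (56 - 8 * i);

    hold <<= start % 8;   /* drop the bits before the start */
    hold >>= 64 - count;  /* right-align the result; count >= 1, so the shift is < 64 */
    return (uint32_t)hold;
}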

Answer 4

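For the sake of completeness, I am adding my solution here, inspired by the comments and answers. Thanks to everyone who took the trouble to think about this problem.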

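#include <stdint.h>

/* firstByteMasks[k] keeps the low (8 - k) bits of the first byte, k = bitoff % 8 */
static const uint8_t firstByteMasks[8] = { 0xFF, 0x7F, 0x3F, 0x1F, 0x0F, 0x07, 0x03, 0x01 };

/* buf: source bytes, bitoff: index of the first bit (MSB-first),
   len: size of buf in bytes, bitcount: number of bits to extract (1..32) */
uint32_t getBits( const uint8_t *buf, const uint32_t bitoff, const uint32_t len, const uint32_t bitcount )
{
    uint64_t result = 0;

    uint32_t startByte = bitoff / 8; // starting byte number
    uint32_t endByte = ((bitoff + bitcount) - 1) / 8; // ending byte number
    // 8 trailing zero bits from the loop plus the unused low bits of the last byte
    uint32_t rightShift = 16 - ((bitoff + bitcount) % 8 );

    // Out of range: returns 0xFFFFFFFF, which is also a valid result when bitcount == 32
    if ( endByte >= len ) return UINT32_MAX;

    if ( rightShift == 16 ) rightShift = 8; // the last bit ends exactly on a byte boundary

    result = buf[startByte] & firstByteMasks[bitoff % 8];
    result = result << 8;

    for ( uint32_t i = startByte + 1; i <= endByte; i++ )
    {
        result |= buf[i];
        result = result << 8;
    }
    result = result >> rightShift;
    return (uint32_t)result;
}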

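A few notes: I tested the code and it seems to work fine, but there may still be bugs. If I find any, I will update the code here. Also, there are probably better solutions!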
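
As a side note on the firstByteMasks table in this answer: its entries follow a simple pattern, so the same values could be computed with a shift instead of a lookup. The helper name first_byte_mask below is an assumption made for illustration; whether a shift or a table is preferable here is a matter of taste.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: same values as firstByteMasks[k], computed on the fly.
// k is the bit offset inside the first byte and must be in 0..7.
static inline uint8_t first_byte_mask(unsigned k)
{
    assert(k < 8);
    return static_cast<uint8_t>(0xFFu >> k);
}
```

For example, first_byte_mask(4) is 0x0F, matching firstByteMasks[4].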

4 answers

Answer 1
4, accepted, 2010-12-14 16:08:21

Answer 2
1, 2010-12-14 15:48:07

Answer 3
1, 2010-12-14 15:53:19

Answer 4
1, 2010-12-15 10:22:58
