Extract a bit sequence from a character

I have an array of characters such as {h,e,l,l,o,o}. First I need to translate it to its bit representation, which gives:

h = 01101000
e = 01100101
l = 01101100
l = 01101100
o = 01101111
o = 01101111

I need to divide all of these bits into groups of five and save them in an array. For example, concatenating the bits of all these characters gives:

011010000110010101101100011011000110111101101111

Now I divide this into groups of five:

01101 00001 10010 10110 11000 11011 00011 01111 01101 111

The last sequence should be completed with zeros, so it would be 00111 instead.

Note: each group of 5 bits will then be completed with a header in order to have 8 bits.

I haven't worked out how to accomplish this yet. I can get the binary representation of each character as follows:

 for (int i = 7; i >= 0; --i)
  {
     printf("%c", (c & (1 << i)) ? '1' : '0');
  }

The problem is how to combine two characters. If I have the two characters 00000001 and 11111110 and divide them into groups of five, the first group takes 5 bits from the first character, while the second group takes the last 3 bits of the first character and the first 2 bits of the second one. How can I make this combination and save all these groups in an array?

Assuming that a byte is made of 8 bits (attention: the C standard doesn't guarantee this), you have to loop over the string and use bit operations to get it done:

  • >> n right shift, to get rid of the n lowest bits
  • << n left shift, to inject n 0 bits in the lowest positions
  • & 0x1f to keep only the 5 lowest bits and reset the higher bits
  • | to merge high bits and low bits, when the overlapping bits are 0

The complete loop can be coded like this:

char s[]="helloo";

unsigned char last=0;          // remaining bits from previous iteration, already placed in the high output part
size_t j=5;                    // number of high input bits to keep in the low output part
unsigned char output=0;
for (char *p=s; *p; p++) {     // iterate on the string
    unsigned char c = (unsigned char)*p;         // shift an unsigned value, never a possibly negative char
    do {
        output = ((c >> (8-j)) | last) & 0x1f;   // last high bits set followed by j bits shifted to lower part; only 5 bits are kept
        printf ("%02x ",(unsigned)output);
        j += 5;                                  // take next block
        last = (c << (j%8)) & 0x1f;              // keep the ignored bits for next iteration
    } while (j<=8);                              // loop if another block ends inside the current byte
    j -= 8;                                      // j==5 here means no bits are pending
}
if (j<5)                                         // there are trailing bits to be output
   printf("%02x\n",(unsigned)last);


The displayed result for your example will be (in hexadecimal): 0d 01 12 16 18 1b 03 0f 0d 1c, which corresponds to the 5-bit groups that you listed, except for the padding of the last one. Note that this code right-pads the last block with zeros if it is not exactly 5 bits long: here the last 3 bits are padded to 11100, i.e. 0x1c, instead of the left-padded 00111, which would be 0x07.

You could easily adapt this code to store the output in a buffer instead of printing it. The only delicate point is to precalculate the size of the output buffer: 8/5 times the input size, rounded up when the number of input bits is not a multiple of 5 (that is, ceil(8n/5) groups for n input bytes), plus 1 if you expect a terminator to be added.

Here is some code that should solve your problem:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char arr[6] = {'h', 'e', 'l', 'l', 'o', 'o'};
    char charcode[9];
    char binarr[121] = "";
    char fives[24][5] = {{0}};
    int i, j, n, numchars, grouping = 0, numgroups = 0;

    /* Build binary string */
    printf("\nCharacter encodings:\n");
    for (j = 0; j < 6; j++) {
        for (i = 0, n = 7;  i < 8; i++, n--)
            charcode[i] = (arr[j] & (01 << n)) ? '1' : '0';
        charcode[8] = '\0';
        printf("%c = %s\n", arr[j], charcode);
        strcat(binarr, charcode);
    }

    /* Break binary string into groups of 5 characters */
    numchars = strlen(binarr);
    j = 0;
    while (j < numchars) {
        i = 0;
        if ((numchars - j) < 5) {                 // add '0' padding
            for (i = 0; i < (5 - (numchars - j)); i++)
                fives[grouping][i] = '0';
        }
        while (i < 5) {                           // write binary digits
            fives[grouping][i] = binarr[j];
            ++i;
            ++j;
        }
        ++grouping;
        ++numgroups;
    }

    printf("\nConcatenated binary string:\n");
    printf("%s\n", binarr);

    printf("\nGroupings of five, with padded final grouping:\n");
    for (grouping = 0; grouping < numgroups; grouping++) {
        for (i = 0; i < 5; i++)
            printf("%c", fives[grouping][i]);
        putchar(' ');
    }
    putchar('\n');

    return 0;
}

When you run this as is, the output is:

Character encodings:
h = 01101000
e = 01100101
l = 01101100
l = 01101100
o = 01101111
o = 01101111

Concatenated binary string:
011010000110010101101100011011000110111101101111

Groupings of five, with padded final grouping:
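01101 00001 10010 10110 11000 11011 00011 01111 01101 00111

Another approach reads the input one bit at a time, collects the bits into groups, and left-pads a short final group with zeros:
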
#include <limits.h>
#include <stdio.h>

#define GROUP_SIZE 5

static int nextBit(void);
static int nextGroup(char *dest);

static char str[] = "helloo";

int main(void) {
    char bits[GROUP_SIZE + 1];
    int firstTime, nBits;

    firstTime = 1;
    while ((nBits = nextGroup(bits)) == GROUP_SIZE) {
        if (!firstTime) {
            (void) putchar(' ');
        }
        firstTime = 0;
        (void) printf("%s", bits);
    }
    if (nBits > 0) {
        if (!firstTime) {
            (void) putchar(' ');
        }
        while (nBits++ < GROUP_SIZE) {
            (void) putchar('0');
        }
        (void) printf("%s", bits);
    }
    (void) putchar('\n');
    return 0;
}

static int nextBit(void) {
    static int bitI = 0, charI = -1;

    if (--bitI < 0) {
        bitI = CHAR_BIT - 1;
        if (str[++charI] == '\0') {
            return -1;
        }
    }
    return (str[charI] & (1 << bitI)) != 0 ? 1 : 0;
}

static int nextGroup(char *dest) {
    int bit, i;

    for (i = 0; i < GROUP_SIZE; ++i) {
        bit = nextBit();
        if (bit == -1) {
            break;
        }
        dest[i] = '0' + bit;
    }
    dest[i] = '\0';
    return i;
}
