Bit shifting `char` vs. `unsigned char`

I need to combine 2 bytes from char pcm[] into a single short in short pcm_[]. This post used a C-style cast, which I first tried out in my C++ program (using Qt):

#include <QCoreApplication>

#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    char pcm[2] = {0xA1, 0x12};
    qDebug()<<pcm[0]<<pcm[1];

    short pcm_ = ( pcm[1] << 8 )| pcm[0];
    qDebug()<<pcm_;

    short pcm_2 =  ((unsigned char)(pcm[1])) << 8| (unsigned char) pcm[0];
    qDebug()<<pcm_2;

    return a.exec();
}

I figured out that it only works if I use unsigned char in the bit shifting, but I do not understand why this is necessary, as the input is a char.

Moreover, I would like to use a C++-style cast, and came up with this one:

short pcm_3 = (static_cast<unsigned char>(pcm[1])) << 8|
               static_cast<unsigned char>(pcm[0]);
qDebug()<<pcm_3;

Again, I need to use unsigned char instead of char.

So I have 2 questions:

  • Is static_cast the right cast? I remember an example from somewhere that used a reinterpret_cast. However, the reinterpret_cast does not work.
  • Why do I have to use unsigned char?

According to the C Standard (6.5.12 Bitwise inclusive OR operator)

3 The usual arithmetic conversions are performed on the operands

The same is written in the C++ Standard (5.13 Bitwise inclusive OR operator)

1 The usual arithmetic conversions are performed;

The usual arithmetic conversions include the integer promotions. This means that in this expression

( pcm[1] << 8 )| pcm[0];

the operand pcm[0] is promoted to type int. If your compiler treats type char like type signed char, then the value 0xA1 is promoted to the signed int 0xFFFFFFA1 (provided that sizeof( int ) is equal to 4). That is, the sign bit is propagated.

Hence you get an incorrect result. To avoid this, you should cast type char to type unsigned char. In this case the promoted value will be 0x000000A1. In C++ it can be written like

static_cast<unsigned char>( pcm[0] ) 
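To make the effect of the promotion concrete, here is a minimal sketch that combines the two bytes with and without the cast. The helper names combine_naive and combine_fixed are made up for illustration, and the sketch assumes a two's-complement system with a 32-bit int. It takes signed char parameters so that the behaviour does not depend on whether plain char is signed on your compiler.

```cpp
// Hypothetical helpers, not from the original post.

// Naive version: lo is promoted to int with sign extension before the OR.
constexpr int combine_naive(signed char lo, signed char hi) {
    return (hi << 8) | lo;
}

// Fixed version: converting to unsigned char first makes the promotion
// produce a non-negative int, so no sign bits reach the high byte.
constexpr int combine_fixed(signed char lo, signed char hi) {
    return (static_cast<unsigned char>(hi) << 8) |
            static_cast<unsigned char>(lo);
}

// -95 is the signed char value whose bit pattern is 0xA1 (two's complement).
static_assert(combine_naive(-95, 0x12) == -95,
              "the sign bits of lo swallow the high byte");
static_assert(combine_fixed(-95, 0x12) == 0x12A1,
              "both bytes survive");
```

The static_assert lines are checked at compile time, so the sketch verifies itself without needing a main function.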

The problem starts here: 问题从这里开始:

char pcm[2] = {0xA1, 0x12};

On your system, char is signed and has a range of -128 through 127. You try to assign 161 to a char. This is out of range.

In C and C++ the result of an out-of-range assignment is implementation-defined. Usually, the compiler goes with the char that has the same representation, which is -95.

Then you promote this to int (by virtue of using it as an operand of |), giving the int value -95, which has a representation starting with lots of 1 bits.

If you actually want to work with the value 161, you will need to use a data type that can hold that value, such as unsigned char. The simplest way is to make pcm[] have that type, rather than using casts.
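As a rough sketch of that suggestion, the array can be declared as unsigned char so that no casts are needed at the point of use. The helper name read_sample_le is hypothetical, and the sketch assumes the low byte comes first, as in the question's code.

```cpp
#include <cstdint>

// Hypothetical helper: builds one 16-bit sample from two bytes, low byte first.
constexpr std::int16_t read_sample_le(const unsigned char* bytes) {
    // Both operands promote to non-negative ints, so the OR cannot
    // pick up stray sign bits.
    return static_cast<std::int16_t>((bytes[1] << 8) | bytes[0]);
}

// 161 (0xA1) fits in unsigned char, so nothing is out of range here.
constexpr unsigned char pcm[2] = {0xA1, 0x12};

static_assert(read_sample_le(pcm) == 0x12A1, "expected 0x12A1");
```

With this layout the conversion to a signed sample happens once, in the final static_cast, rather than implicitly at every byte access.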

You have to use unsigned char because of the promotion to int performed by operator |.

Assuming int is 32 bits:

  • signed char 0xA1 becomes int 0xFFFFFFA1 (to keep the same value)
  • unsigned char 0xA1 becomes 0x000000A1.

The reason you need to cast char to unsigned char is that char is allowed to be a signed data type. In this case it would be sign-extended before performing the |, meaning that the lower half would become negative for chars with the most significant bit set to 1:

char c = 200;
int a = c | 0; // a is -56 on systems where char is signed

In this example, using static_cast or the C cast is a matter of style. Many C++ shops stay away from C casts, because they are harder to find in the source code, while static_casts are much easier to spot.

  1. You should cast the data to an unsigned type, because when you "expand" a signed char to a signed short, its 7th bit gets replicated to bits 8-15 of the short. So, from A1, which is 10100001, you get 1111111110100001.
  2. According to this question and answer, reinterpret_cast is the last cast you should think of.
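The bit pattern from item 1 can be checked with a small sketch. The helper bits_of is hypothetical and only exists to view a 16-bit value as unsigned; the sketch assumes a two's-complement system.

```cpp
#include <cstdint>

// Hypothetical helper: view the bits of a 16-bit signed value as unsigned.
constexpr std::uint16_t bits_of(std::int16_t s) {
    return static_cast<std::uint16_t>(s);
}

// Widening a signed char copies bit 7 into bits 8-15: 0xA1 -> 0xFFA1.
static_assert(bits_of(static_cast<signed char>(-95)) == 0xFFA1,
              "signed widening sign-extends");

// Widening an unsigned char fills bits 8-15 with zeros: 0xA1 -> 0x00A1.
static_assert(bits_of(static_cast<unsigned char>(0xA1)) == 0x00A1,
              "unsigned widening zero-extends");
```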
