Bit shifting `char` vs. `unsigned char`

Question

I need to convert 2 bytes in char pcm[] into a single 2-byte short pcm_. This post used a C-style cast, which I first tried out in my C++ program (using Qt):

#include <QCoreApplication>

#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    char pcm[2] = {0xA1, 0x12};
    qDebug()<<pcm[0]<<pcm[1];

    short pcm_ = ( pcm[1] << 8 )| pcm[0];
    qDebug()<<pcm_;

    short pcm_2 =  ((unsigned char)(pcm[1])) << 8| (unsigned char) pcm[0];
    qDebug()<<pcm_2;

    return a.exec();
}

I figured out that it only works if I use unsigned char in the bit shifting, but I do not understand why this is necessary, as the input is a char.

Moreover, I would like to use a C++-style cast, and came up with this one:

short pcm_3 = (static_cast<unsigned char>(pcm[1])) << 8|
               static_cast<unsigned char>(pcm[0]);
qDebug()<<pcm_3;

Again, I need to use unsigned char instead of char .

So I have 2 questions:

1. Is static_cast the right cast? I remember an example from somewhere that used reinterpret_cast, but reinterpret_cast does not work here.
2. Why do I have to use unsigned char?

Answer 1

According to the C Standard (6.5.12 Bitwise inclusive OR operator)

3 The usual arithmetic conversions are performed on the operands

The same is written in the C++ Standard (5.13 Bitwise inclusive OR operator)

1 The usual arithmetic conversions are performed;

The usual arithmetic conversions include the integer promotions. This means that in this expression

( pcm[1] << 8 )| pcm[0];

the operand pcm[0] is promoted to type int. If your compiler's settings make type char behave like type signed char, the value 0xA1 is stored as -95 and promoted to the signed int 0xFFFFFFA1 (provided that sizeof( int ) is equal to 4). That is, the sign bit is propagated.

Hence you will get an incorrect result. To avoid this, you should cast type char to type unsigned char. In this case the promoted value will be 0x000000A1. In C++ it can be written as

static_cast<unsigned char>( pcm[0] )
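To make the effect of the promotion concrete, here is a minimal sketch that evaluates both variants of the expression. The helper names combine_naive and combine_fixed are made up for illustration, and signed char is used explicitly so the behaviour does not depend on how the compiler treats plain char.

```cpp
// Hypothetical helper: the original expression, with char behaving as signed char.
// Both operands are promoted to int with sign extension before | is applied.
// (Only meaningful here for a non-negative hi; shifting a negative value
// left was undefined before C++20.)
constexpr int combine_naive(signed char lo, signed char hi)
{
    return (hi << 8) | lo;
}

// Hypothetical helper: the same expression with each byte cast to
// unsigned char first, so the promotion to int zero-extends.
constexpr int combine_fixed(signed char lo, signed char hi)
{
    return (static_cast<unsigned char>(hi) << 8) | static_cast<unsigned char>(lo);
}
```

With lo = -95 (the bit pattern 0xA1) and hi = 0x12, combine_naive gives -95 because the sign-extended low byte overwrites the high byte, while combine_fixed gives 0x12A1.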

Answer 2

The problem starts here:

char pcm[2] = {0xA1, 0x12};

On your system, char is signed, and has a range of -128 through 127 . You try to assign 161 to a char . This is out of range.

In C and C++ the result of converting an out-of-range value to a signed type is implementation-defined. Usually, the compiler picks the char with the same bit representation, which is -95.

Then you promote this to int (by virtue of using it as operand of | ), giving the int value -95 which has a representation starting with lots of 1 bits.

If you actually want to work with the value 161 you will need to use a data type that can hold that value, such as unsigned char . The simplest way is to make pcm[] have that type, rather than using casts.
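A minimal sketch of that suggestion, assuming the two bytes are stored low byte first as in the question. The function name sample_from_le is hypothetical, and std::int16_t stands in for short.

```cpp
#include <cstdint>

// Hypothetical helper: combine two bytes (low byte first) into one 16-bit sample.
// Because the elements are unsigned char, promotion to int zero-extends them
// and no casts are needed on the operands.
// Note: if the high byte is 0x80 or above, the final conversion to int16_t was
// implementation-defined before C++20 (it wraps on mainstream compilers).
constexpr std::int16_t sample_from_le(const unsigned char (&bytes)[2])
{
    return static_cast<std::int16_t>(bytes[0] | (bytes[1] << 8));
}

// 0xA1 (161) now fits in the element type, so nothing is out of range.
constexpr unsigned char pcm[2] = {0xA1, 0x12};
constexpr std::int16_t pcm_ = sample_from_le(pcm);
```

Here pcm_ is 0x12A1, the value the question was after.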

Answer 3

You have to use unsigned char because operator | promotes its operands to int.

Assuming int is 32 bits:

signed char 0xA1 becomes int 0xFFFFFFA1 (to keep same value)
unsigned char 0xA1 becomes 0x000000A1 .
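The two bit patterns can be checked with a small sketch. The helper names are made up for illustration; converting the promoted int to std::uint32_t is only a way to look at its bits, and that conversion is defined modulo 2^32 regardless of the width of int.

```cpp
#include <cstdint>

// Hypothetical helper: promote a signed char to int, then expose the bits.
constexpr std::uint32_t promoted_from_signed(signed char c)
{
    return static_cast<std::uint32_t>(static_cast<int>(c));
}

// Hypothetical helper: the same for unsigned char.
constexpr std::uint32_t promoted_from_unsigned(unsigned char c)
{
    return static_cast<std::uint32_t>(static_cast<int>(c));
}
```

Passing -95 (the signed reading of 0xA1) to promoted_from_signed gives 0xFFFFFFA1, and passing 0xA1 to promoted_from_unsigned gives 0x000000A1.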

Answer 4

The reason you need to cast char to unsigned char is that char is allowed to be a signed data type. In this case it would be sign-extended before performing the | , meaning that the promoted value of the low byte would become negative, with all of its upper bits set, for chars with the most significant bit set to 1:

char c = 200;
int a = c | 0; // yields -56 on systems where char is signed

In this example, choosing between static_cast and the C cast is a matter of style. Many C++ shops stay away from C casts because they are harder to find in the source code, while static_casts are much easier to spot.

Answer 5

You should cast the value to an unsigned type, because when a signed char is "expanded" to a signed short, its bit 7 is replicated into bits 8-15 of the short. So, from A1, which is 10100001, you get 1111111110100001.
According to this question and answer, reinterpret_cast is the last cast you should think of.
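As a rough illustration of why reinterpret_cast "does not work" on a single char: it does not convert values, so the only place it applies here is on a pointer to the buffer. The sketch below contrasts the two; byte_value and sample_via_pointer are hypothetical names, and the low-byte-first layout is taken from the question.

```cpp
#include <cassert>

// Value conversion of one byte: static_cast is the matching C++ cast.
constexpr int byte_value(char c)
{
    return static_cast<unsigned char>(c);
}

// reinterpret_cast cannot turn a char value into an unsigned char value.
// It can only re-view the buffer through a pointer to unsigned char,
// which C++ permits for inspecting the bytes of any object.
inline short sample_via_pointer(const char* pcm)
{
    assert(pcm != nullptr);  // the caller must pass at least two bytes
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(pcm);
    return static_cast<short>((bytes[1] << 8) | bytes[0]);
}
```

Both routes read the same bits. The static_cast version states the intent (convert this value) directly, which is why it is usually the first choice.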

5 answers

solution1
3 2015-07-19 11:59:40

solution2
1 2015-07-19 12:14:30

solution3
0 2015-07-19 11:59:58

solution4
0 2015-07-19 12:00:53

solution5
0 2015-07-19 12:07:01
