
Bitshift - Need explanation to understand the code

I was wondering what this function actually does. To my understanding it should return pSrc[1].

So why does it bother left-shifting pSrc[0] by 8 bits, which zeroes out those 8 bits? When these zeroes are ORed with pSrc[1], pSrc[1] is not affected, so you get pSrc[1] anyway, as if the bitwise OR had never happened.

/*
* Get 2 big-endian bytes.
*/
INLINE u2 get2BE(unsigned char const* pSrc)
{
    return (pSrc[0] << 8) | pSrc[1];
}

This function is from the source code of the Dalvik virtual machine: https://android.googlesource.com/platform/dalvik/+/android-4.4.4_r1/vm/Bits.h

Update:

OK, now I got it thanks to all the answers here.

(1) pSrc[0] is originally an unsigned char (1 byte).

(2) In the left shift (pSrc[0] << 8), pSrc[0] is promoted to a signed int (4 bytes on typical platforms) by the integral promotions. This happens regardless of the type of the literal 8.

(3) The result of pSrc[0] << 8 is that the 8 bits of interest in pSrc[0] are shifted into the second-least-significant byte of the 4-byte signed int, leaving zeroes in the other three bytes (1st, 3rd and 4th).

(4) When it is ORed (intermediate result from step (3) | pSrc[1]), pSrc[1] is also promoted to a signed int (4 bytes).

(5) The result of (intermediate result from step (3) | pSrc[1]) has the two least significant bytes set the way we want, with zeroes in the two most significant bytes.

(6) Returning the result as a u2 keeps only the two least significant bytes, which are the 2 big-endian bytes.
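The steps above can be traced with concrete bytes. The following is a minimal sketch, assuming u2 is a 16-bit unsigned type (the question does not show its definition) and that int is 32 bits wide. get2BE_traced and get2BE_traced_example are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: u2 is a 16-bit unsigned integer; the question does not show its definition.
typedef std::uint16_t u2;

// Same expression as the function in the question, with each intermediate named.
// get2BE_traced is a hypothetical name used only for this illustration.
inline u2 get2BE_traced(unsigned char const* pSrc)
{
    int hi = pSrc[0] << 8;         // pSrc[0] is promoted to int first, so no bits are lost
    int lo = pSrc[1];              // promoted to int as well
    int both = hi | lo;            // high byte in bits 8..15, low byte in bits 0..7
    return static_cast<u2>(both);  // keep only the low 16 bits
}

// Usage: the high byte is preserved, so the result is not just pSrc[1].
inline void get2BE_traced_example()
{
    unsigned char const bytes[2] = {0x12, 0x34};
    assert(get2BE_traced(bytes) == 0x1234);
}
```

For the bytes 0x12 and 0x34, hi is 0x1200 and lo is 0x34, so the function returns 0x1234 rather than 0x34.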

For arithmetic operations like this, the unsigned char is converted via a process called integral promotions.

C++11 - N3485 §5.8 [expr.shift]/1:

The operands shall be of integral or unscoped enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand.

And §13.6 [over.built]/17:

For every pair of promoted integral types L and R, there exist candidate operator functions of the form

LR operator%(L, R);
LR operator&(L, R);
LR operator^(L, R);
LR operator|(L, R);
L operator<<(L, R);
L operator>>(L, R);

where LR is the result of the usual arithmetic conversions between types L and R.

When integral promotions are done (§4.5 [conv.prom]/1):

A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.

By integral promotions, the unsigned char will be promoted to int. The other operand is already int, so no changes in type are made to it. The return type then becomes int as well.

Thus, what you have is the first unsigned char's bits shifted left, but still in the now-bigger int, and then the second unsigned char's bits at the end.
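The promotion can be checked at compile time. This is a small sketch, assuming int is 32 bits wide and can represent every unsigned char value, as on mainstream platforms. shift_operand_is_promoted is a hypothetical helper name.

```cpp
#include <type_traits>

// Compile-time checks of the promotion described above.
// Assumption: int can hold every unsigned char value and is 32 bits wide,
// which holds on mainstream platforms.
// shift_operand_is_promoted is a hypothetical name used for illustration.
inline bool shift_operand_is_promoted()
{
    unsigned char b = 0xAB;
    // The shift expression has type int, not unsigned char.
    static_assert(std::is_same<decltype(b << 8), int>::value,
                  "unsigned char is promoted to int before the shift");
    // Unary plus also triggers integral promotion.
    static_assert(std::is_same<decltype(+b), int>::value,
                  "integral promotion yields int");
    // With a 32-bit int, the 8 shifted bits survive in the wider type.
    return (b << 8) == 0xAB00;
}
```

If the shift were performed in unsigned char, the result would be 0. Because it happens in int, the value is 0xAB00.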

You'll notice that the return type of operator| is the result of usual arithmetic conversions between the two operands. At this point, those are the int from the shift and the second unsigned char.

This conversion is defined as follows (§5 [expr]/10):

Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:

Otherwise, the integral promotions (4.5) shall be performed on both operands. Then the following rules shall be applied to the promoted operands:

If both operands have the same type, no further conversion is needed.

Since L and R, being promoted before this, are already int, the promotion leaves them the same and the overall return type of the expression is thus int, which is then converted to u2, whatever that happens to be.
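The type of the whole expression, and the final conversion to the return type, can be sketched as follows. This assumes u2 is a 16-bit unsigned integer and int is 32 bits wide. or_then_narrow is a hypothetical helper, not part of the Dalvik source.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumption: u2 is a 16-bit unsigned integer (not shown in the quoted source).
typedef std::uint16_t u2;

// or_then_narrow is a hypothetical name used only for this illustration.
inline u2 or_then_narrow(unsigned char hi, unsigned char lo)
{
    // Both operands of | are int after promotion, so the result is int.
    static_assert(std::is_same<decltype((hi << 8) | lo), int>::value,
                  "the whole expression has type int");
    int wide = (hi << 8) | lo;
    // With 8-bit bytes and a 32-bit int, the value always fits in 16 bits.
    assert(wide >= 0 && wide <= 0xFFFF);
    return static_cast<u2>(wide);  // conversion to the return type
}
```

Under those assumptions the conversion to u2 never discards set bits, because the two upper bytes of the int are always zero.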

There are no operations (other than type conversions) on unsigned char. Before any operation, integral promotion occurs, which converts the unsigned char to an int. So the operation is shifting an int left, not an unsigned char.

C11 6.5.7 Bitwise shift operators

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

So pSrc[0] is integer promoted to an int. The literal 8 is already an int, so no integer promotion takes place. The usual arithmetic conversions do not apply to shift operators: they are a special case.

Since the original variable is an unsigned char that is promoted to int and then left-shifted by 8 bits, "E1" (our promoted variable) is signed, and the result may not be representable in the result type. On a system with a 16-bit int, that is undefined behavior whenever pSrc[0] is 0x80 or greater.

In plain English: if you shift something into the sign bit of a signed variable, anything can happen. In general: relying on implicit type promotions is bad programming and dangerous practice.
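The 16-bit case can be checked with plain arithmetic. This is a sketch that does the multiplication in a 32-bit type, so the check itself cannot overflow. fits_in_16bit_int is a hypothetical helper introduced for illustration.

```cpp
#include <cstdint>

// Would pSrc[0] << 8 be representable if int were only 16 bits wide?
// The product is computed in a 32-bit type so the check itself cannot overflow.
// fits_in_16bit_int is a hypothetical helper used only for illustration.
constexpr bool fits_in_16bit_int(unsigned char b)
{
    return static_cast<std::int32_t>(b) * 256 <= INT16_MAX;
}

static_assert(fits_in_16bit_int(0x7F), "0x7F00 = 32512 fits in a 16-bit int");
static_assert(!fits_in_16bit_int(0x80), "0x8000 = 32768 exceeds 32767");
```

Any first byte with its top bit set would therefore overflow a 16-bit int. With a 32-bit int, every possible result fits.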

You should fix the code to this:

((unsigned int)pSrc[0] << 8) | (unsigned int)pSrc[1]
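Placed back into the function, the suggested expression might look like the sketch below. It assumes u2 is a 16-bit unsigned integer. get2BE_unsigned and get2BE_unsigned_example are hypothetical names, not the ones used in the Dalvik source.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: u2 is a 16-bit unsigned integer (not shown in the quoted source).
typedef std::uint16_t u2;

// get2BE_unsigned is a hypothetical name for the corrected variant.
inline u2 get2BE_unsigned(unsigned char const* pSrc)
{
    // Converting before the shift makes the arithmetic unsigned, which is
    // well defined even when unsigned int is only 16 bits wide.
    return static_cast<u2>(((unsigned int)pSrc[0] << 8) | (unsigned int)pSrc[1]);
}

// Usage: a first byte with its top bit set is handled without signed overflow.
inline void get2BE_unsigned_example()
{
    unsigned char const bytes[2] = {0xFF, 0xFE};
    assert(get2BE_unsigned(bytes) == 0xFFFE);
}
```

The explicit cast on the return value only makes the final narrowing visible. The value is the same as with the implicit conversion.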
