
Inconsistencies in sign extension when shifting signed int vs short

int main(){
  signed int a = 0b00000000001111111111111111111111; 
  signed int b = (a << 10) >> 10;
  // b is: 0b11111111111111111111111111111111

  signed short c = 0b0000000000111111; 
  signed short d = (c << 10) >> 10;
  // d is: 0b111111

  return 0;
}

Assuming int is 32 bits and short is 16 bits, why does b get sign extended while d does not? I have tested this with gdb on x64, compiled with gcc.

To get the short sign extended, I had to use two separate variables like this:

  signed short f = c << 10;
  signed short g = f >> 10;
  // g is: 0b1111111111111111

In the case of signed short, when an integer type smaller than int is used in an expression, it is (in most cases) promoted to type int. This is spelled out in section 6.3.1.1p2 of the C standard:

The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

This promotion applies to the bitwise shift operators in particular, as specified in section 6.5.7p3:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

So the short value 0x003f is promoted to the int value 0x0000003f and the left shift is applied. This results in 0x0000fc00, and the right shift gives a result of 0x0000003f.
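The promotion can be observed directly. The following is only a sketch, assuming a C11 compiler (for _Generic) and a 32-bit int; IS_INT, shift_roundtrip and check_promotion are names made up for this illustration:

```c
#include <assert.h>

/* Evaluates to 1 when expr has type int, 0 otherwise (C11 _Generic). */
#define IS_INT(expr) _Generic((expr), int: 1, default: 0)

/* Hypothetical helper: the short-case expression from the question.
   c is promoted to int before either shift is applied. */
int shift_roundtrip(short c)
{
    return (c << 10) >> 10;
}

/* Hypothetical self-check, assuming int is 32 bits wide. */
void check_promotion(void)
{
    short c = 0x003f;

    assert(IS_INT(c << 10));            /* the shift result is an int, not a short */
    assert((c << 10) == 0x0000fc00);    /* bit 15 is set, but that is not an int's sign bit */
    assert(shift_roundtrip(c) == 0x3f); /* so nothing is sign extended on the way back */
}
```

Calling check_promotion() from main should pass silently on such a platform, because every intermediate value stays a nonnegative int.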

The signed int case is a bit more interesting. In this case you're left-shifting a bit with the value 1 into the sign bit. This triggers undefined behavior as per 6.5.7p4:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

So while the output you get for the signed int case is what you might expect, it is actually undefined behavior, and you can't depend on that result.
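If the goal of (a << 10) >> 10 was to sign-extend the low 22 bits, one way to get the same value without relying on undefined behavior is to do the bit manipulation on an unsigned value. A minimal sketch follows; sign_extend is a hypothetical helper, not a standard function, and it assumes the fixed-width types from <stdint.h> are available:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interpret the low `bits` bits of `value` as a
   two's-complement number. Requires 1 <= bits <= 32. */
int32_t sign_extend(uint32_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);

    uint32_t sign  = UINT32_C(1) << (bits - 1); /* sign bit of the field */
    uint32_t mask  = sign | (sign - 1);         /* the low `bits` bits */
    uint32_t field = value & mask;              /* drop anything above the field */

    /* Flipping the sign bit and then subtracting its weight maps
       [0, 2^bits) onto [-2^(bits-1), 2^(bits-1)). The subtraction is done
       in 64 bits so that it cannot overflow. */
    return (int32_t)((int64_t)(field ^ sign) - (int64_t)sign);
}
```

With the question's value, sign_extend(0x003FFFFF, 22) gives -1, which is the all-ones pattern the question observed, but without shifting into the sign bit of a signed type.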

short is automatically converted to int by the integer promotions, per C 2018 6.5.7 3:

The integer promotions are performed on each of the operands…

So (c << 10) shifts an int 0b111111 left 10 bits, yielding (in your C implementation) the 32-bit int 0b00000000000000001111110000000000. The sign bit in that is zero; it is a positive number.

When you do signed short f = c << 10;, the result of c << 10 is too big to fit in a signed short. It is 64,512, which is above the largest value your signed short can represent, 32,767. In an assignment, the value is converted to the type of the left operand. Per C 2018 6.3.1.3 3, the conversion is implementation-defined. GCC defines this conversion to wrap modulo 65,536 (two to the power of the number of bits in the type). So converting 64,512 yields 64,512 − 65,536 = −1024. So f is set to −1024.
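The wrap-around arithmetic can be modeled with ordinary, well-defined operations. This is only a sketch of the modulo-65,536 rule described above, not GCC's actual implementation; wrap_to_16_bits is a hypothetical name:

```c
#include <assert.h>

/* Hypothetical helper: reduce `value` modulo 2^16 into the range
   [-32768, 32767], mirroring the wrap-around conversion to a 16-bit
   signed short. */
int wrap_to_16_bits(long value)
{
    long r = value % 65536L;       /* in (-65536, 65536), sign follows value */

    if (r < 0)
        r += 65536L;               /* now in [0, 65535] */
    if (r > 32767L)
        r -= 65536L;               /* fold the upper half onto the negatives */
    return (int)r;
}
```

For the value in question, wrap_to_16_bits(64512) returns -1024, matching 64,512 − 65,536.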

Then, in f >> 10, you are shifting a negative value. As a signed short, f is still promoted to int, but this conversion keeps the value, resulting in an int value of −1024. This is then shifted. This shift is implementation-defined, and GCC defines it to shift with sign extension. So the result of -1024 >> 10 is −1.
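If you need a sign-extending right shift that does not depend on the implementation's choice, one common trick is to shift the complement instead. A sketch, assuming two's-complement int with no padding bits; asr is a hypothetical helper name:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: right shift with sign extension that never
   right-shifts a negative value. Requires n < width of int. */
int asr(int x, unsigned n)
{
    assert(n < sizeof(int) * CHAR_BIT);

    if (x >= 0)
        return x >> n;   /* well-defined for nonnegative values */

    /* For negative x, ~x is nonnegative, so shifting it is well-defined;
       complementing the result shifts ones in from the left. */
    return ~(~x >> n);
}
```

Here asr(-1024, 10) gives -1, the same value GCC produces for -1024 >> 10.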

For starters, according to the C Standard (6.5.7 Bitwise shift operators):

3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand.

Thus this value

signed short c = 0b0000000000111111;

in the expression used in this declaration

signed short d = (c << 10) >> 10;

is promoted to the integer type int. As the value is positive, the promoted value is also positive.

Thus this operation

c << 10

does not touch the sign bit.

On the other hand, this code snippet

signed int a = 0b00000000001111111111111111111111; 
signed int b = (a << 10) >> 10;

has undefined behavior because, according to the same section of the C Standard:

4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
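The condition in paragraph 4 can be checked before shifting. The following sketch covers the int case only and assumes int has no padding bits; left_shift_is_defined is a hypothetical helper name:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 when e1 << e2 has a defined result for
   type int under the rule quoted above, 0 otherwise. */
int left_shift_is_defined(int e1, int e2)
{
    int width = (int)(sizeof(int) * CHAR_BIT);

    if (e2 < 0 || e2 >= width)
        return 0;                   /* shift count out of range */
    if (e1 < 0)
        return 0;                   /* negative left operand */
    return e1 <= (INT_MAX >> e2);   /* e1 * 2^e2 must fit in int */
}
```

With a 32-bit int, left_shift_is_defined(0x003FFFFF, 10) is 0 (the signed int case from the question), while left_shift_is_defined(0x3F, 10) is 1 (the promoted short case).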
