
What is the rule for C to cast between short and int?

I'm confused when using C to cast between short and int. I assume short is 16-bit and int is 32-bit. I tested with the code below:

unsigned short a = 0xFFFF;
signed short b = 0xFFFF;

unsigned int u16tou32 = a;
unsigned int s16tou32 = b;
signed int u16tos32 = a;
signed int s16tos32 = b;

printf("%u %u %d %d\n", u16tou32, s16tou32, u16tos32, s16tos32);

What I got is:

  • u16tou32: 65535
  • s16tou32: 4294967295
  • u16tos32: 65535
  • s16tos32: -1

What confuses me is the conversion from s16 to u32, as well as from u16 to s32. It seems that s16 to u32 does a "sign extension", while u16 to s32 does not. What exactly is the rule behind this? Is this implementation-dependent? Is it safe to do this type of casting in C, or should I use bit manipulation myself to avoid unexpected results?

Any time an integer type is converted to a different integer type, it falls through a deterministic pachinko machine of rules dictated by the standard and, on one occasion, by the implementation.

The general overview on value-qualification:

C99 6.3.1.1-p2

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

That said, let's look at your conversions. The signed short to unsigned int conversion is covered by the following, since the value being converted (-1) falls outside the unsigned int domain:

C99 6.3.1.3-p2

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

Which basically means "add UINT_MAX+1". On your machine, UINT_MAX is 4294967295, so this becomes

-1 + 4294967295 + 1 = 4294967295

Regarding your unsigned short to signed int conversion, that is covered by the regular value-qualified conversion rule. Specifically:

C99 6.3.1.3-p1

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

In other words, because the value of your unsigned short falls within the representable range of signed int, nothing special is done and the value is simply stored.

And finally, as mentioned in the general comment above, something special happens in your declaration of b:

signed short b = 0xFFFF;

The 0xFFFF in this case is a signed integer (type int). Its decimal value is 65535. However, that value is not representable by a signed short, so yet another conversion happens, one that perhaps you weren't aware of:

C99 6.3.1.3-p3

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

In other words, your implementation chose to store it as (-1), but you cannot rely on that on a different implementation.

What's happening here is that the right-hand side of the assignment is first extended from 16 to 32 bits, and the conversion to the left-hand-side type only happens at assignment. This means that if the right-hand side is signed, it will be sign-extended when it's converted to 32 bits; likewise, if it's unsigned, it will just be zero-padded.

If you're careful with your casts then there shouldn't be any problem. But unless you're doing something super performance-intensive, the extra couple of bitwise operations shouldn't hurt anything.

On another note, if you're doing anything where you're assuming certain bit-widths for different integer types, you should really be explicit and use the types defined in stdint.h. I recently got bitten by this while porting (someone else's) code from *nix to Windows, as the Visual C++ compiler uses a different convention for integer sizes (LLP64) than any other x64 or power-7 compiler I've used (LP64). In short, if you want 32 bits, you're better off saying so explicitly with a type like uint32_t.


So this will always hold when such a conversion happens in C? Is it defined by the C standard? – Jun

Yes, it should always hold. Relevant quotes from the C99 standard: "The integer promotions preserve value including sign." When handling usual arithmetic type conversions: "... the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands..."

As stated in the question, assume 16-bit short and 32-bit int.

unsigned short a = 0xFFFF;

This initializes a to 0xFFFF, or 65535. The expression 0xFFFF is of type int; it's implicitly converted to unsigned short, and the value is preserved.

signed short b = 0xFFFF;

This is a little more complicated. Again, 0xFFFF is of type int. It's implicitly converted to signed short, but since the value is outside the range of signed short, the conversion cannot preserve the value.

Conversion of an integer to a signed integer type, when the value can't be represented, yields an implementation-defined value. In principle, the value of b could be anything between -32768 and +32767 inclusive. In practice, it will almost certainly be -1. I'll assume for the rest of this answer that the value is -1.

unsigned int u16tou32 = a;

The value of a is 0xFFFF, which is converted from unsigned short to unsigned int. The conversion preserves the value.

unsigned int s16tou32 = b;

The value of b is -1. It's converted to unsigned int, which clearly cannot store a value of -1. Conversion of an integer to an unsigned integer type (unlike conversion to a signed type) is defined by the language; the result is reduced modulo MAX + 1, where MAX is the maximum value of the unsigned type. In this case, the value stored in s16tou32 is UINT_MAX, or 0xFFFFFFFF.

signed int u16tos32 = a;

The value of a, 0xFFFF, is converted to signed int. The value is preserved.

signed int s16tos32 = b;

The value of b, -1, is converted to signed int. The value is preserved.

So the stored values are:

a == 0xFFFF (65535)
b == -1     (not guaranteed, but very likely)
u16tou32 == 0xFFFF (65535)
s16tou32 == 0xFFFFFFFF (4294967295)
u16tos32 == 0xFFFF (65535)
s16tos32 == -1

To summarize the integer conversion rules:

If the target type can represent the value, the value is preserved.

Otherwise, if the target type is unsigned, the value is reduced modulo MAX+1, which is equivalent to discarding all but the low-order N bits. Another way to describe this is that the value MAX+1 is repeatedly added to or subtracted from the value until you get a result that's in range (this is actually how the C standard describes it). Compilers don't actually generate code to do this repeated addition or subtraction; they just have to get the right result.

Otherwise, the target type is signed and cannot represent the value; the conversion yields an implementation-defined value. In almost all implementations, the result discards all but the low-order N bits using a two's-complement representation. (C99 added a rule for this case, permitting an implementation-defined signal to be raised instead. I don't know of any compiler that does this.)

This is an unsigned short representation of the number 65535:

unsigned short a = 0xFFFF;

This is a signed short representation of the number -1 (on typical two's-complement implementations):

signed short b = 0xFFFF;

Simple promotion from unsigned short to unsigned int, so u16tou32 is an unsigned int representation of the number 65535:

unsigned int u16tou32 = a;

b (value of -1) is promoted to int. Thus its hex representation would be 0xFFFFFFFF. It is then cast to unsigned, so it is a representation of the number 4294967295:

unsigned int s16tou32 = b;

The promotion of a from unsigned short gives the value 65535. It is then cast to signed int, which is also a representation of the number 65535:

signed int u16tos32 = a;

Simple promotion of signed short to signed int, so s16tos32 is also a representation of the number -1:

signed int s16tos32 = b;
