
Why do arithmetic operations on unsigned chars promote them to signed integers?

Original: 2020-05-27 09:34:42 4 1 c++/c

Many answers to similar questions point out that this is so because of the standard. But I cannot understand the reasoning behind this decision by the standard setters.

From my understanding, an unsigned char does not store its value in 2's complement form. So I don't see a situation where, let's say, XORing two unsigned chars would produce unexpected behavior. Therefore, promoting them to int just seems like a waste of space (in most cases) and CPU cycles.

Moreover, why int? If a variable is declared as unsigned, clearly the unsignedness is important to the programmer, so a promotion to unsigned int would still make more sense than a promotion to int, in my opinion.

[EDIT #1] As pointed out in the comments, promotion to unsigned int will take place if an int cannot accommodate every value of the unsigned char.

[EDIT #2] To clarify the question: if it is about the performance benefit of operating on int rather than char, then why is it in the standard? This could have been given as a suggestion to compiler designers for better optimization. Now, if someone were to design a compiler that didn't do this, their compiler would not fully adhere to the C/C++ standard even though, hypothetically, it supported all other required features of the language. In a nutshell, I cannot figure out why I cannot operate directly on unsigned chars, so the requirement to promote them to ints seems unnecessary. Can you give me an example which proves this wrong?

1 answer

You can find this document on-line: Rationale for International Standard - Programming Languages - C (Revision 5.10, 2003).

Chapter 6.3 (p. 44-45) is about conversions:

Between the publication of K&R and the development of C89, a serious divergence had occurred among implementations in the evolution of integer promotion rules. Implementations fell into two major camps which may be characterized as unsigned preserving and value preserving.

The difference between these approaches centered on the treatment of unsigned char and unsigned short when widened by the integer promotions, but the decision had an impact on the typing of constants as well (see §6.4.4.1).

The unsigned preserving approach calls for promoting the two smaller unsigned types to unsigned int. This is a simple rule, and yields a type which is independent of execution environment.

The value preserving approach calls for promoting those types to signed int if that type can properly represent all the values of the original type, and otherwise for promoting those types to unsigned int.

Thus, if the execution environment represents short as something smaller than int, unsigned short becomes int; otherwise it becomes unsigned int. Both schemes give the same answer in the vast majority of cases, and both give the same effective result in even more cases in implementations with two's complement arithmetic and quiet wraparound on signed overflow - that is, in most current implementations. In such implementations, differences between the two only appear when these two conditions are both true:

An expression involving an unsigned char or unsigned short produces an int-wide result in which the sign bit is set, that is, either a unary operation on such a type, or a binary operation in which the other operand is an int or “narrower” type.

The result of the preceding expression is used in a context in which its signedness is significant:
• sizeof(int) < sizeof(long) and it is in a context where it must be widened to a long type, or
• it is the left operand of the right-shift operator in an implementation where this shift is defined as arithmetic, or
• it is either operand of /, %, <, <=, >, or >=.

In such circumstances a genuine ambiguity of interpretation arises. The result must be dubbed questionably signed, since a case can be made for either the signed or unsigned interpretation. Exactly the same ambiguity arises whenever an unsigned int confronts a signed int across an operator, and the signed int has a negative value. Neither scheme does any better, or any worse, in resolving the ambiguity of this confrontation. Suddenly, the negative signed int becomes a very large unsigned int, which may be surprising, or it may be exactly what is desired by a knowledgeable programmer. Of course, all of these ambiguities can be avoided by a judicious use of casts.

One of the important outcomes of exploring this problem is the understanding that high-quality compilers might do well to look for such questionable code and offer (optional) diagnostics, and that conscientious instructors might do well to warn programmers of the problems of implicit type conversions.

The unsigned preserving rules greatly increase the number of situations where unsigned int confronts signed int to yield a questionably signed result, whereas the value preserving rules minimize such confrontations. Thus, the value preserving rules were considered to be safer for the novice, or unwary, programmer. After much discussion, the C89 Committee decided in favor of value preserving rules, despite the fact that the UNIX C compilers had evolved in the direction of unsigned preserving.

QUIET CHANGE IN C89

A program that depends upon unsigned preserving arithmetic conversions will behave differently, probably without complaint. This was considered the most serious semantic change made by the C89 Committee to a widespread current practice.

For reference, you can find more details about those conversions updated to C11 in this answer by Lundin.