Why can a compiler convert char to int in C++?

Through programming in Java and now in C++, I have found that you can convert a char to an int and then an int to a double.

I want to know why a char can be converted to an int.

In C++ they don't have the same amount of memory allocated to them: char is 8 bits and an int is 32 bits. So how does this work?

Is that just how the compiler is set up? I just want an explanation.

Thanks for any and all help!

On some architectures, every value that a char can hold, an int can also hold. So if you have a char, you can use it to initialize an int by giving the int the same value that the char holds. This shouldn't be surprising.

On other architectures this isn't true. Still, C++ allows any integer type to be converted to any other integer type. This must be so because it was also allowed in C, but you can prevent such "narrowing" conversions by using brace-initialization.
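As a rough illustration of the brace-initialization point, here is a minimal sketch. The helper names widen and narrow are hypothetical, and it assumes a platform where int can represent every char value (true on mainstream desktop and server targets).

```cpp
// Widening: brace-initialization accepts char -> int when int can
// represent every char value (assumed here), so this is not "narrowing".
constexpr int widen(char c) { return int{c}; }

// Going the other way has to be requested explicitly, for example with
// static_cast; the caller is responsible for the value fitting in a char.
constexpr char narrow(int i) { return static_cast<char>(i); }

// By contrast, brace-initializing a char from a non-constant int is a
// narrowing conversion and is ill-formed:
//
//     int i = 65;
//     char c{i};   // rejected (or at least diagnosed) by the compiler
//     char d = i;  // plain initialization still compiles, as in C
```

The commented-out lines are left as comments on purpose, since the point is that they should not compile cleanly.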

What I think you're concerned about, and it should be a concern on any architecture, is taking a char and "passing it off" as an int. But that's not what converting char to int does. It is what converting to int& using reinterpret_cast would do; such a conversion is dangerous and potentially triggers undefined behaviour, not only because int and char don't have the same size, but also because they may not have the same representation even if they do have the same size.

Conversion between numeric types is permitted by the C++ standard; that is, this feature is part of the C++ language itself.

Specifically for the conversions you mention: char to int is normally an integral promotion, and int to double is a floating-integral conversion (a standard conversion rather than a promotion). On mainstream platforms neither step loses information. For the first step the standard guarantees

sizeof(char) <= sizeof(int)

and on platforms where int is wider than char, int can hold every char value. For the second step the standard gives no size guarantee relating int and double, but a typical 64-bit IEEE 754 double has a 53-bit significand, which is enough to represent every value of a 32-bit int exactly.

That is, in practice the range of values representable by char is included in the range of values representable by int, and every int value is exactly representable as a double.
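The char to int to double chain can be sketched as follows. The helper names are hypothetical, and the int_fits_in_double check is only an assumption-checker for the current platform, not something the standard promises.

```cpp
#include <limits>

// Implicit conversions only; no casts are needed for either step.
constexpr int to_int(char c) { return c; }        // char -> int
constexpr double to_double(int i) { return i; }   // int  -> double

// True when double has at least as many binary significand digits as
// int has value bits, i.e. every int is exactly representable.
// This holds for 32-bit int and IEEE 754 double, but it is a property
// of the platform, not a guarantee of the language.
constexpr bool int_fits_in_double =
    std::numeric_limits<double>::radix == 2 &&
    std::numeric_limits<double>::digits >= std::numeric_limits<int>::digits;
```

On a platform where int_fits_in_double is false (for example, a hypothetical one with 64-bit int), the second conversion could round.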

Frozen history. C++ was based on C, and still almost has C as a proper subset. And in the 1970s, when C was developed, there was no big difference between a byte and a character: almost all, if not absolutely all, of the main text encodings used a single byte per character.

In modern C++, char and its two variants unsigned char and signed char are just the basic byte types, where a byte is defined as the smallest addressable unit of memory, and when used for characters char is just the basic encoding unit (e.g. with UTF-8 a character consists of one to four bytes).

Counting from the standardization in 1998 onward, C++ has acquired three more purpose-built character types: wchar_t (in C++98 itself), and char16_t and char32_t (in C++11), but unfortunately no strongly typed such type.


The compiler options or setup don't affect whether char converts implicitly to integer, but they can affect whether plain char is a signed or an unsigned type. Usually it's signed, also for historical reasons, which is impractical. As a signed type it's still distinct from signed char, e.g. with respect to overload resolution of a function call, and as an unsigned type it's distinct from unsigned char.
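The distinctness of the three char types can be checked at compile time. In this sketch the overload set named which and the constant plain_char_is_signed are hypothetical names chosen for illustration.

```cpp
#include <limits>
#include <type_traits>

// Plain char is its own type, whichever signedness it happens to have.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are distinct types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are distinct types");

// Overload resolution therefore sees three different parameter types.
constexpr int which(char)          { return 0; }
constexpr int which(signed char)   { return 1; }
constexpr int which(unsigned char) { return 2; }

// Signedness of plain char is implementation-defined; with GCC and Clang
// it can be switched with -fsigned-char / -funsigned-char.
constexpr bool plain_char_is_signed = std::numeric_limits<char>::is_signed;
```

A call such as which('a') selects the plain char overload regardless of how the compiler is configured; only plain_char_is_signed changes.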


You can define a strongly typed, byte-sized character encoding value type by using an enum:

using Byte = unsigned char;
enum class Byte_char : Byte  {};

“Strongly typed” means that it doesn't convert implicitly to a number.

However, I prefer the more relaxed type checking of

enum Byte_char : Byte  {};

which converts to integer, but is a type distinct from uses of Byte for other purposes (this doesn't mean that I use a Byte_char type; it's just what I find practical when defining such a type).
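To compare the two enum definitions side by side, here is a small sketch. Both are called Byte_char in the text above; the suffixes _strict and _relaxed and the code_of helpers are hypothetical names added so that both variants can live in one file.

```cpp
#include <type_traits>

using Byte = unsigned char;

enum class Byte_char_strict : Byte {};  // scoped: no implicit conversion
enum Byte_char_relaxed : Byte {};       // unscoped: converts to integer

// The scoped enum does not convert implicitly to int; the unscoped one does.
static_assert(!std::is_convertible<Byte_char_strict, int>::value,
              "scoped enum needs an explicit cast");
static_assert(std::is_convertible<Byte_char_relaxed, int>::value,
              "unscoped enum converts implicitly");

// Both are distinct from Byte itself, and both are one byte in size.
static_assert(!std::is_same<Byte_char_relaxed, Byte>::value, "distinct type");
static_assert(sizeof(Byte_char_strict) == 1 && sizeof(Byte_char_relaxed) == 1,
              "same size as the underlying Byte");

// Getting the numeric code out of each variant.
constexpr int code_of(Byte_char_strict c)  { return static_cast<int>(c); }
constexpr int code_of(Byte_char_relaxed c) { return c; }
```

Values of either type are created with an explicit cast, for example static_cast<Byte_char_relaxed>(65).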

As far as size is concerned, such a type can in practice also be defined as a struct, because as far as I know no extant C++ compiler inserts padding in a single-byte struct. However, such a definition can be incompatible with the “short buffer optimization” of std::basic_string. The enum works fine with that optimization.

The char is not converted to the digit it depicts; its character code (ASCII on most systems) becomes the int value. If you convert the char '5' to int, you might expect to get the integer value 5, but you will get 53, which is the ASCII code of '5'. The byte holding the code is simply widened to the width of an int (typically 32 bits).
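The difference between a character's code and the digit it depicts can be sketched like this. The function names are hypothetical, and the value 53 assumes an ASCII-compatible execution character set.

```cpp
// Converting a char to int yields its character code.
constexpr int char_code(char c) { return c; }

// To get the digit a character depicts, subtract '0'. The standard
// guarantees that '0'..'9' have consecutive codes, so this works in any
// encoding. Precondition: c is one of '0'..'9'.
constexpr int digit_value(char c) { return c - '0'; }
```

With ASCII, char_code('5') is 53 while digit_value('5') is 5.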
