
Difference between char and int when declaring character

I just started learning C and am rather confused about declaring characters using int and char.

I am well aware that characters are represented by integers, in the sense that the "integer" of a character is its respective ASCII code.

That said, I learned that it's perfectly possible to declare a character using int without using the ASCII code. E.g., declaring the variable test as the character 'X' can be written as:

char test = 'X';

and

int test = 'X';

And for both declarations of the character, the conversion character is %c (even though test is defined as int).
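
To see this concretely, here is a minimal sketch (a complete program added for illustration) showing that both declarations print the same character:

#include <stdio.h>

int main(void) {
    char c_test = 'X';
    int  i_test = 'X';

    /* %c works for both: a char passed to a variadic function such as
       printf is promoted to int automatically anyway */
    printf("%c %c\n", c_test, i_test);   /* prints: X X */
    return 0;
}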

Therefore, my question is: what are the differences between declaring character variables using char and int, and when should int be used to declare a character variable?

The difference is the size in bytes of the variable, and from there the range of values the variable can hold.

A char is required to accept all values between 0 and 127 (inclusive). So in common environments it occupies exactly one byte (8 bits). Whether it is signed (-128 to 127) or unsigned (0 to 255) is implementation-defined.

An int is required to be at least a 16-bit signed word, and to accept all values between -32767 and 32767. That means that an int can accept all values from a char, whether the latter is signed or unsigned.
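
As a small sketch, the actual ranges on a given platform can be printed with the macros from <limits.h>:

#include <limits.h>
#include <stdio.h>

int main(void) {
    /* CHAR_MIN is 0 or SCHAR_MIN depending on whether char is signed */
    printf("char: %d .. %d\n", CHAR_MIN, CHAR_MAX);
    printf("int : %d .. %d\n", INT_MIN, INT_MAX);
    return 0;
}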

If you want to store only characters in a variable, you should declare it as char. Using an int would just waste memory, and could mislead a future reader. One common exception to that rule is when you want to process a wider value for special conditions. For example, the function fgetc from the standard library is declared as returning int:

int fgetc(FILE *fd);

because the special value EOF (for End Of File) is defined as the int value -1 (all bits set to one on a two's-complement system), which requires more range than a char provides. That way no char (only 8 bits on a common system) can be equal to the EOF constant. If the function were declared to return a plain char, nothing could distinguish the EOF value from the (valid) char 0xFF.

That's the reason why the following code is bad and should never be used:

char c;    // a terrible memory saving...
...
while ((c = fgetc(stdin)) != EOF) {   // NEVER WRITE THAT!!!
    ...
}

Inside the loop, a char would be enough to hold the value, but to keep the test from wrongly succeeding when the character 0xFF is read, the variable needs to be an int.
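
Here is the corrected version as a sketch; the variable is an int so the comparison with EOF is reliable:

#include <stdio.h>

int main(void) {
    int c;   /* int, not char: EOF (-1) must stay distinguishable from 0xFF (255) */

    while ((c = fgetc(stdin)) != EOF) {
        putchar(c);   /* c holds a valid character value here */
    }
    return 0;
}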

The char type has multiple roles.

The first is that it is simply part of the chain of integer types, char, short, int, long, etc., so it's just another container for numbers.

The second is that its underlying storage is the smallest addressable unit, and all other objects have a size that is a multiple of the size of char (sizeof returns a number that is in units of char, so sizeof(char) == 1).
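
A quick sketch that checks that convention:

#include <stdio.h>

int main(void) {
    /* sizeof measures in units of char, so sizeof(char) is 1 by definition */
    printf("char: %zu, int: %zu, long: %zu\n",
           sizeof(char), sizeof(int), sizeof(long));
    return 0;
}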

The third is that it plays the role of a character in a string, certainly historically. When seen like this, the value of a char maps to a specified character, for instance via the ASCII encoding, but it can also be used with multi-byte encodings (one or more chars together map to one character).
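
As an illustration of the multi-byte case (assuming the source file and the terminal both use UTF-8), strlen counts chars, i.e. bytes, not displayed characters:

#include <stdio.h>
#include <string.h>

int main(void) {
    /* "é" is the two-byte UTF-8 sequence 0xC3 0xA9 */
    const char *word = "caf\xC3\xA9";
    printf("%s has %zu bytes\n", word, strlen(word));   /* 5, not 4 */
    return 0;
}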

Usually you should declare characters as char and use int for integers capable of holding bigger values. On most systems a char occupies a byte, which is 8 bits. Depending on your system this char might be signed or unsigned by default; as such it will be able to hold values between 0 and 255 or between -128 and 127.

An int might be 32 bits long, but if you really want exactly 32 bits for your integer you should declare it as int32_t or uint32_t instead.
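
For example, a sketch using the fixed-width types from <stdint.h> (the matching printf macros come from <inttypes.h>):

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    int32_t  a = -5;       /* exactly 32 bits, signed */
    uint32_t b = 70000u;   /* exactly 32 bits, unsigned */
    printf("%" PRId32 " %" PRIu32 "\n", a, b);
    return 0;
}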

On most architectures, an int has a size of 4 bytes, while a char has a size of 1 byte.

I think there's no difference, but you're allocating extra memory you're not going to use. You could also write const long a = 1;, but it will be more suitable to use const char a = 1; instead.
