Why is a variable of type int used for string input in C?

Question

I am working through methods of input and output in C, and I have been presented with a segment of code that has an element I cannot understand. The purpose of this code is to show how 'echoing' and 'buffered' input/output work, and in it a variable of type 'int' is declared for what are, as I understand it, characters:

#include <stdio.h>

int main(void){

    int ch;    //This is what I do not get: why is this type 'int'?

    while((ch = getchar()) != '\n'){
        putchar(ch);
    }
    return 0;
}

I'm not on firm footing with type casting as it is, and this 'int' / 'char' discrepancy is undermining all the notions I have regarding data types and compatibility.

Answer 1

getchar() returns an int because it is designed to be able to return a value, EOF, that is distinct from every character it can return; the characters themselves come back as unsigned char values converted to int. (C11 §7.21.1 ¶3 and §7.21.7.6 ¶3)

Your looping code should take into account that getchar() might return EOF:

while((ch = getchar()) != EOF){
    if (ch != '\n') putchar(ch);
}

Answer 2

The getc, fgetc and getchar functions return int because they are capable of handling binary data, as well as providing an in-band signal of an error or end-of-data condition.

Except on certain embedded platforms which have an unusual byte size, the type int is capable of representing all of the byte values from 0 to UCHAR_MAX as positive values. In addition, it can represent negative values, such as the value of the constant EOF.

The type unsigned char would only be capable of representing the values 0 to UCHAR_MAX, and so the functions would not be able to use the return value as a way of indicating the inability to read another byte of data. The value EOF is convenient because it can be treated as if it were an input symbol; for instance, it can be included in a switch statement which handles various characters.

There is a little bit more to this because in the design of C, values of short and char type (signed or unsigned) undergo promotion when they are evaluated in expressions.

In classic C, before prototypes were introduced, when you pass a char to a function, it's actually an int value which is passed. Concretely:

int func(c)
char c;
{
   /* ... */
}

This kind of old-style definition does not introduce information about the parameter types. When we call this as func(c), where c has type char, the expression c is subject to the usual promotion, and becomes a value of type int. This is exactly the type which is expected by the above function definition. A parameter of type char actually passes through as a value of type int. If we write an ISO C prototype declaration for the above function, it has to be, guess what:

int func(int); /* not int func(char) */

Another legacy is that character constants like 'A' actually have type int and not char. It is noteworthy that this changes in C++, because C++ has overloaded functions. Given the overloads:

void f(int);
void f(char);

we want f(3) to call the former, and f('A') to call the latter.

So the point is that the designers of C basically regarded char as being oriented toward representing a compact storage location, the smallest addressable unit of memory. But as far as data manipulation in the processor was concerned, they were thinking of the values as being word-sized int values: character processing is essentially data manipulation based on int.

This is one of the low-level facets of C. In machine languages on byte-addressable machines, we usually think of bytes as being units of storage, and when we load them into registers to work with them, they occupy a full register, and so become 32-bit values (or what have you). This is mirrored in the concept of promotion in C.

Answer 3

The return type of getchar() is int. It returns the character code (ASCII on most systems) of the character it has just read. For ordinary characters this is (and I know someone's gonna correct me on this) the same as the char representation, so you can freely compare them and so on.

Why is it this way? The getchar() function is ancient, from the very earliest days of K&R C. putchar() similarly takes an int argument, when you'd think it might take a char.

Hope that helps!

3 answers

Answer 1
4 2013-10-25 00:25:02

Answer 2
3 accepted 2013-10-25 00:37:19

Answer 3
1 2013-10-25 00:23:41
