
Why would a type int variable be used in a string input in C?

I am working through methods of input and output in C, and I have been presented with a segment of code that has an element I cannot understand. The purpose of this code is to show how 'echoing' and 'buffered' input/output work, and in the code a variable of type 'int' is declared to hold, as I understand it, characters:

#include <stdio.h>

int main(void){

    int ch;    //This is what I do not get: why is this type 'int'?

    while((ch = getchar()) != '\n'){
        putchar(ch);
    }
    return 0;
}

I'm not on firm footing with type casting as it is, and this 'int' / 'char' discrepancy is undermining all notions that I have regarding data types and compatibility.

getchar() returns an int because it must be able to return EOF, a value distinct from every character value, to signal end-of-file (C11 §7.21.1 ¶3 and §7.21.7.6 ¶3).

Your looping code should take into account that getchar() might return EOF:

while((ch = getchar()) != EOF){
    if (ch != '\n') putchar(ch);
}
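Note that the loop above keeps reading until end-of-input and merely skips newlines. If the intent is to keep the question's behaviour (stop at the first newline) while also coping with EOF, both conditions can be tested together. The following is only a sketch: `echo_line` is a hypothetical helper name, and it takes explicit streams so it can be exercised without a terminal.

```c
#include <stdio.h>

/* Echo characters from `in` to `out` until a newline or end-of-input.
 * Returns the number of characters echoed.
 * `echo_line` is a hypothetical name chosen for this sketch. */
long echo_line(FILE *in, FILE *out)
{
    int ch;          /* int, so that EOF stays distinct from every byte value */
    long count = 0;

    while ((ch = getc(in)) != EOF && ch != '\n') {
        putc(ch, out);
        count++;
    }
    return count;
}
```

Calling `echo_line(stdin, stdout)` from `main` would behave like the original loop, except that it also terminates when the input ends without a newline.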

The getc, fgetc and getchar functions return int because they are capable of handling binary data, as well as providing an in-band signal of an error or end-of-data condition.

Except on certain embedded platforms which have an unusual byte size, the type int is capable of representing all of the byte values from 0 to UCHAR_MAX as non-negative values. In addition, it can represent negative values, such as the value of the constant EOF.
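To see why that extra range matters, here is a small sketch that counts bytes in a stream twice: once keeping the result of getc() in an int, and once deliberately squeezing it into a signed char. The function names are hypothetical. The narrow version assumes a typical two's-complement implementation where converting 255 to signed char yields -1 and where EOF is -1; the C standard leaves that conversion implementation-defined.

```c
#include <stdio.h>

/* Correct: the result of getc() is kept in an int, so the byte 0xFF (255)
 * and EOF (a negative value) remain distinguishable. */
long count_bytes_int(FILE *in)
{
    int ch;
    long n = 0;

    while ((ch = getc(in)) != EOF)
        n++;
    return n;
}

/* Broken on purpose: the result is narrowed to signed char. On typical
 * implementations the byte 0xFF becomes -1, which compares equal to EOF,
 * so counting stops early at that byte. */
long count_bytes_narrow(FILE *in)
{
    signed char ch;
    long n = 0;

    while ((ch = (signed char)getc(in)) != EOF)
        n++;
    return n;
}
```

With plain char the outcome depends on whether char is signed on the platform: a signed char stops early as above, while an unsigned char can never compare equal to a negative EOF, so the loop would never end.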

The type unsigned char would only be capable of representing the values 0 to UCHAR_MAX, and so the functions would not be able to use the return value as a way of indicating the inability to read another byte of data. The value EOF is convenient because it can be treated as if it were an input symbol; for instance it can be included in a switch statement which handles various characters.
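As an illustration of treating EOF like just another input symbol, the sketch below tallies a few kinds of characters with a single switch. The `struct tally` type and the `tally_stream` function are hypothetical names invented for this example.

```c
#include <stdio.h>

/* Hypothetical counters for this sketch. */
struct tally {
    long letters_a;   /* 'a' or 'A' */
    long newlines;    /* '\n' */
    long other;       /* everything else */
};

/* Read `in` to the end, classifying each character with one switch.
 * EOF appears as an ordinary case label alongside the characters. */
struct tally tally_stream(FILE *in)
{
    struct tally t = {0, 0, 0};

    for (;;) {
        int ch = getc(in);

        switch (ch) {
        case EOF:
            return t;
        case '\n':
            t.newlines++;
            break;
        case 'a':
        case 'A':
            t.letters_a++;
            break;
        default:
            t.other++;
            break;
        }
    }
}
```

This only works because `ch` is an int: with a char variable, `case EOF` could collide with a real byte value or never match at all.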

There is a little bit more to this because in the design of C, values of short and char type (signed or unsigned) undergo promotion when they are evaluated in expressions.
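Promotion can be observed directly with C11's `_Generic`, which selects a branch based on the type of an expression. This is a sketch only; the `TYPE_NAME` macro and the three functions are hypothetical names, and C11 or later is assumed.

```c
/* Map the type of an expression to a printable name (C11 _Generic).
 * Only the types relevant to this sketch are listed. */
#define TYPE_NAME(x) _Generic((x),        \
    char:          "char",                \
    signed char:   "signed char",         \
    unsigned char: "unsigned char",       \
    short:         "short",               \
    int:           "int",                 \
    default:       "other")

/* A char object on its own still has type char. */
const char *type_of_plain_char(void)
{
    char c = 'x';
    return TYPE_NAME(c);
}

/* As soon as it takes part in arithmetic, both operands are promoted. */
const char *type_of_char_sum(void)
{
    char c = 'x';
    return TYPE_NAME(c + c);
}

/* The same happens to short, even under a unary operator. */
const char *type_of_short_negated(void)
{
    short s = 1;
    return TYPE_NAME(-s);
}
```

Here `type_of_plain_char()` yields "char", while the other two yield "int".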

In classic C, before prototypes were introduced, when you pass a char to a function, it's actually an int value which is passed. Concretely:

int func(c)
char c;
{
   /* ... */
}

This kind of old-style definition does not introduce information about the parameter types. When we call this as func(c), where c has type char, the expression c is subject to the default argument promotions, and becomes a value of type int. This is exactly the type which is expected by the above function definition. A parameter of type char actually passes through as a value of type int. If we write an ISO C prototype declaration for the above function, it has to be, guess what:

int func(int); /* not int func(char) */
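The same default argument promotions still apply in modern C to the variable part of a variadic function, which is why a char passed there has to be fetched as an int. A minimal sketch, with `sum_chars` being a hypothetical helper:

```c
#include <stdarg.h>

/* Sum `n` characters passed as variadic arguments.
 * Each char argument is promoted to int by the caller, so it must be
 * retrieved with va_arg(ap, int); va_arg(ap, char) would be undefined
 * behaviour. `sum_chars` is a hypothetical name for this sketch. */
int sum_chars(int n, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, n);
    for (int i = 0; i < n; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

For example, with `char a = 'A', b = 'B';` the call `sum_chars(2, a, b)` returns `'A' + 'B'`.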

Another legacy is that character constants like 'A' actually have type int and not char. It is noteworthy that this changes in C++, because C++ has overloaded functions. Given the overloads:

void f(int);
void f(char);

we want f(3) to call the former, and f('A') to call the latter.
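On the C side, the int type of a character constant shows up in `sizeof`. A small sketch (the function names are hypothetical, and the code must be compiled as C, not C++):

```c
#include <stddef.h>

/* In C, 'A' is an int, so this is sizeof(int), commonly 4. */
size_t char_constant_size(void)
{
    return sizeof 'A';
}

/* An object of type char always has size 1, by definition. */
size_t char_object_size(void)
{
    char c = 'A';
    return sizeof c;
}
```

Compiled as C++, the first function would return 1 instead, because there 'A' has type char.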

So the point is that the designers of C basically regarded char as being oriented toward representing a compact storage location, and the smallest addressable unit of memory. But as far as data manipulation in the processor was concerned, they were thinking of the values as being word-sized int values: that character processing is essentially data manipulation based on int.

This is one of the low-level facets of C. In machine languages on byte-addressable machines, we usually think of bytes as being units of storage, and when we load them into registers to work with them, they occupy a full register, and so become 32-bit values (or what have you). This is mirrored in the concept of promotion in C.

The return type of getchar() is int. It returns the code of the character it has just read (the ASCII code on most systems), as an unsigned char value converted to int, or EOF when there is nothing left to read. For ordinary ASCII characters this is the same value as the char representation, so you can freely compare them and so on.

Why is it this way? The getchar() function is ancient, from the very earliest days of K&R C. putchar() similarly takes an int argument, when you'd think it might take a char.

Hope that helps!
