Confusion about int, char, and EOF in C

I'm working through the 2nd edition of K&R's classic C programming book. Here's an example from page 17:

#include <stdio.h>

/* copy input to output */
int main(void)
{
    int c;
    /* char c works as well!! */
    while ((c = getchar()) != EOF)
        putchar(c);
    return 0;
}

The book states that int c is used so that it can hold EOF, which turns out to be -1 on my Windows machine with GCC and can't be represented by char. However, when I tried char c, it worked with no problem. Out of curiosity I tried some more:

int  a = EOF;
char b = EOF;
char e = -1;
printf("%d %d %d %c %c %c \n", a, b, e, a, b, e);

The output is -1 -1 -1 with no visible characters (according to the ASCII table, %c should print an nbs (no-break space) here, but it is invisible).

So how can EOF be assigned to a char without any compiler error?

Moreover, given that EOF is -1, are both b and e above stored as FF in memory? They shouldn't be; otherwise, how could the compiler distinguish EOF from nbs?

Update:

Most likely EOF (0xFFFFFFFF) is converted to char 0xFF, but in (c = getchar()) != EOF the left-hand side 0xFF is promoted to int 0xFFFFFFFF before the comparison, so the type of c can be either int or char.

In this case EOF happens to be 0xFFFFFFFF, but in theory EOF could be any value that needs more than 8 bits to represent correctly, with the leftmost bytes not necessarily being FFFFFF, in which case the char c approach would fail.

Reference: K&R, The C Programming Language, 2e


This code works because you're using signed chars. If you look at an ASCII table you'll find two things. First, there are only 128 values (0 through 127). 127 takes seven bits to represent, so in an 8-bit signed char the top bit is the sign bit. Second, EOF is not in this table, so the C implementation is free to define it as it sees fit (the C standard only requires EOF to be a negative int).

The assignment from char to int is allowed by the compiler because you're assigning from a smaller type to a larger type. int is guaranteed to be able to represent any value a char can represent.

Note also that 0xFF is equal to 255 when interpreted as an unsigned char and -1 when interpreted as a signed char:

0b11111111

However, when those two values are represented as 32-bit integers, they look very different:

255 : 0b00000000000000000000000011111111
-1  : 0b11111111111111111111111111111111

EOF and 0xFF are not the same, so the program has to be able to distinguish between them. If you read the man page for getchar(), you'll see that it returns the character read as an unsigned char cast to an int, or EOF on end of file or error.

In your while ((c = getchar()) != EOF), a char c is promoted to int before the comparison, because EOF is itself an int, so the test is effectively

((int)c != EOF)
