Confusion about int, char, and EOF in C
I'm learning from the 2nd edition of K&R's classic C programming book. Here's an example from page 17:
#include <stdio.h>
/* copy input to output */
main()
{
    int c;
    // char c works as well!!
    while ((c = getchar()) != EOF)
        putchar(c);
}
The book states that int c is used so that it can hold EOF, which turns out to be -1 on my Windows machine with GCC and can't be represented by char. However, when I tried char c, it worked with no problem. Out of curiosity I tried some more:
int a = EOF;
char b = EOF;
char e = -1;
printf("%d %d %d %c %c %c \n", a, b, e, a, b, e);
and the output is -1 -1 -1 with no visible character (according to the ASCII table, %c should print an nbs (no-break space) here, but it's invisible).
So how can a char be assigned EOF without any compiler error?
Moreover, given that EOF is -1, are both b and e above stored as FF in memory? They should not be, otherwise how can the compiler distinguish EOF from nbs...?
Update:
Most likely EOF 0xFFFFFFFF is cast to char 0xFF, but in (c = getchar()) != EOF the LHS 0xFF is promoted to int 0xFFFFFFFF before the comparison, so the type of c can be either int or char.
In this case EOF happens to be 0xFFFFFFFF, but theoretically EOF can be any value that needs more than 8 bits to represent correctly, with the leftmost bytes not necessarily being FFFFFF, and then the char c approach will fail.
Reference: K&R, The C Programming Language, 2e
This code works because you're using signed chars. If you look at an ASCII table you'll find two things. First, there are only 128 values (0 through 127); 127 takes seven bits to represent, and the top bit of a signed char is the sign bit. Second, EOF is not in this table, so the C implementation is free to define it as it sees fit (the C standard only requires it to be a negative int).
The assignment from char to int is allowed by the compiler because you're assigning from a smaller type to a larger type. int is guaranteed to be able to represent any value a char can represent.
Note also that 0xFF is equal to 255 when interpreted as an unsigned char and -1 when interpreted as a signed char:
0b11111111
However, when represented as a 32 bit integer, the two values look very different:
255 : 0b00000000000000000000000011111111
-1  : 0b11111111111111111111111111111111
EOF and 0xFF are not the same, so the compiler has to distinguish between them. If you read the man page for getchar(), you'll see that it returns the character read as an unsigned char cast to an int, or EOF on end of file or error.
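With char c, your while ((c = getchar()) != EOF) effectively compares
((int)(char)getchar() != EOF)
The assignment first narrows the value to char, and the comparison then promotes it back to int (not unsigned int). With a signed char, an input byte 0xFF therefore becomes indistinguishable from EOF.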