Sign extension in C: char > unsigned char
When I was reading K&R, I was confused by this code:
#include "syscalls.h"
int getchar(void)
{
    char c;
    return (read(0, &c, 1) == 1) ? (unsigned char)c : EOF;
}
It is said that the cast to unsigned char is there to avoid errors caused by sign extension in this code. This is the only case I can think of, and I wrote this example code:
char c = 0xf0; // 11110000, just sets the highest bit to 1
printf("%i\n",(int)(unsigned char)c);
printf("%i\n",(int)c);
Output:
240 // 0...011110000
-16 // 1...111110000
But in fact ASCII is just 0~127, so the highest bit can never be set to 1. Why does K&R cast char to unsigned char?
ASCII is limited to the range 0..127, but it's not only ASCII that can be read by read. In K&R, it could get the entire 0..255 range of char values.
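As a rough illustration of input that is not ASCII, here is a minimal sketch that counts bytes above 127 in a buffer. The name count_non_ascii is hypothetical and not from K&R; the example assumes 8-bit bytes. The UTF-8 encoding of "é", for instance, is the two bytes 0xC3 0xA9, both outside the ASCII range.

```c
#include <stddef.h>

/* Hypothetical helper (not from K&R): counts the bytes in buf whose
 * value is above 127, i.e. outside the ASCII range. */
size_t count_non_ascii(const char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        /* Compare as unsigned char so the test also works where
         * plain char is signed. */
        if ((unsigned char)buf[i] > 127)
            n++;
    }
    return n;
}
```

Called on the five bytes of "caf\xc3\xa9", this returns 2; on plain "abc" it returns 0.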
That's why getchar returned an int: it had to be able to return any char value plus a special EOF value that was distinct from all other characters.
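The caller's side of this can be sketched as follows. buf_getc is a hypothetical in-memory stand-in for getchar, and the two counting functions are illustrative only. The buggy variant relies on implementation-defined narrowing and on EOF being -1, which holds on common platforms but is not guaranteed by the standard.

```c
#include <stdio.h>   /* EOF */
#include <stddef.h>

/* Hypothetical in-memory stand-in for getchar: returns the next byte
 * as a value in 0..255, or EOF once the buffer is exhausted. */
static int buf_getc(const char *buf, size_t len, size_t *pos)
{
    if (*pos >= len)
        return EOF;
    return (unsigned char)buf[(*pos)++];
}

/* Keeps the result in an int, so byte 255 and EOF stay distinct. */
size_t count_bytes_int(const char *buf, size_t len)
{
    size_t pos = 0, n = 0;
    int c;
    while ((c = buf_getc(buf, len, &pos)) != EOF)
        n++;
    return n;
}

/* Narrows the result to signed char first. Assuming EOF is -1 and the
 * usual two's-complement narrowing, byte 0xFF becomes -1 and is
 * mistaken for end of input. */
size_t count_bytes_narrow(const char *buf, size_t len)
{
    size_t pos = 0, n = 0;
    signed char c;
    while ((c = (signed char)buf_getc(buf, len, &pos)) != EOF)
        n++;
    return n;
}
```

On the three bytes 'a', 0xFF, 'b', count_bytes_int sees all three, while count_bytes_narrow stops at the 0xFF byte under the assumptions above.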
By casting the character to an unsigned char before promoting it to an int on return, it prevented the values 128..255 from being sign-extended. If you allowed that sign extension, you would not have been able to tell the difference between 255 (which would sign-extend to all 1-bits) and EOF (which was -1, all 1-bits).
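The two widening paths can be compared side by side. This is only a sketch: the helper names are made up for illustration, signed char is used explicitly so the result does not depend on whether plain char is signed, and 8-bit bytes are assumed.

```c
/* Hypothetical helper: widens a signed byte directly, so the sign bit
 * is extended into the upper bits of the int. */
int widen_plain(signed char c)
{
    return c;
}

/* Hypothetical helper: converts to unsigned char first, as K&R's
 * getchar does, so the result lies in 0..255. */
int widen_via_unsigned(signed char c)
{
    return (unsigned char)c;
}
```

For the all-1-bits byte, widen_plain((signed char)-1) yields -1, which is indistinguishable from EOF wherever EOF is -1, while widen_via_unsigned((signed char)-1) yields 255. The same pair of calls on -16 gives -16 and 240, matching the output in the question.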
I'm not entirely certain your strategy of using K&R to learn the language is a good one, by the way. C has come a long way since those days. From memory, even the latest K&R book was still for the C89/90 ANSI standard (before ISO basically took over responsibility) and the language has been through two massive upgrades since then.
unsigned char variables have values between 0 and 255. As for why the cast is needed, see this comment from the same book:
Whether plain chars are signed or unsigned is machine-dependent, but printable characters are always positive.
return (read(0, &c, 1) == 1) ? (unsigned char)c : EOF;
means: read one char into c; if one char could be read, return it; otherwise return (the int) EOF.
Note that getchar() returns an int, so the conversion is char -> unsigned char -> int.
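The conversion chain can be spelled out one step at a time. The sketch below assumes a POSIX system (read comes from <unistd.h> rather than K&R's "syscalls.h"), and the name fd_getchar and its fd parameter are hypothetical additions so that any descriptor can be read, not only standard input.

```c
#include <stdio.h>   /* EOF */
#include <unistd.h>  /* read (POSIX) */

/* Hypothetical variant of K&R's getchar that reads from any descriptor. */
int fd_getchar(int fd)
{
    char c;

    /* Anything other than exactly one byte read is reported as EOF. */
    if (read(fd, &c, 1) != 1)
        return EOF;

    /* Step 1: char -> unsigned char, giving a value in 0..255. */
    unsigned char byte = (unsigned char)c;

    /* Step 2: unsigned char -> int, with no sign extension. */
    return (int)byte;
}
```

With this, a 0xFF byte in the input comes back as 255, and EOF is returned only when read does not deliver a byte.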