Sign extension in C, char > unsigned char

When I was reading K&R, I was confused by this code:

#include "syscalls.h"
int getchar(void)
{
    char c;

    return (read(0, &c, 1) == 1) ? (unsigned char)c : EOF;
}

It is said that the unsigned char cast is there to avoid errors caused by sign extension. The only case I can think of is the one in this example code:

char c = 0xf0; // 11110000, sets the highest bit to 1
printf("%i\n",(int)(unsigned char)c);
printf("%i\n",(int)c);

Output:  240 // 0...011110000
         -16 // 1...111110000

But ASCII only covers 0~127, so the highest bit can never be set to 1. Why does K&R cast char to unsigned char?

ASCII is limited to the range 0..127, but read is not restricted to ASCII: in K&R, it could return any value in the entire 0..255 range of a char.

That's why getchar returned an int: it had to be able to return any char value plus a special EOF value that was distinct from all other characters.

By casting the character to an unsigned char before promoting it to an int on return, it prevented the values 128..255 from being sign-extended. If you allowed that sign extension, you would not be able to tell the difference between 255 (which would sign-extend to all 1-bits) and EOF (which was -1, also all 1-bits).


By the way, I'm not entirely certain that using K&R to learn the language is a good strategy. C has come a long way since those days. From memory, even the latest edition of K&R still targeted the C89/C90 ANSI standard (before ISO basically took over responsibility), and the language has been through two massive upgrades since then.

unsigned char variables hold values between 0 and 255. As for why the cast is needed, see this comment from the same book:

Whether plain chars are signed or unsigned is machine-dependent, but printable characters are always positive.
return (read(0, &c, 1) == 1) ? (unsigned char)c : EOF;

means: read one char into c; if one char could be read, return it; otherwise return (the int) EOF.

Note that getchar() returns an int, so the conversion is char -> unsigned char -> int.
