
Sign extension in C: casting char to unsigned char

While reading K&R, I was confused by this code:

#include "syscalls.h"
int getchar(void)
{
    char c;

    return (read(0, &c, 1) == 1) ? (unsigned char)c : EOF;
}

The book says the cast to unsigned char is there to avoid errors caused by sign extension. This is the only case I can think of, so I wrote this example:

char c = 0xf0; // 11110000: the highest bit is set to 1
printf("%i\n",(int)(unsigned char)c);
printf("%i\n",(int)c);

Output:  240 // 0...011110000
         -16 // 1...111110000

But ASCII only covers 0~127, so the highest bit can never be 1. Why does K&R cast char to unsigned char?

ASCII is limited to the range 0..127, but read is not limited to ASCII: in the K&R code, it could deliver any value in the entire 0..255 range of char values.

That's why getchar returned an int , because it had to be able to return any char value plus a special EOF value that was distinct from all other characters.

By casting the character to an unsigned char before promoting it to an int on return, it prevented the values 128..255 being sign-extended. If you allowed that sign extension, you would not have been able to tell the difference between 255 (which would sign extend to all 1-bits) and EOF (which was -1, all 1-bits).


I'm not entirely certain your strategy of using K&R to learn the language is a good one by the way. C has come a long way since those days. From memory, even the latest K&R book was still for the C89/90 ANSI standard (before ISO basically took over responsibility) and the language has been through two massive upgrades since then.

unsigned char variables hold values between 0 and 255. As for why the cast is required, see this comment from the same book:

Whether plain chars are signed or unsigned is machine-dependent, but printable characters are always positive.

return (read(0, &c, 1) == 1) ? (unsigned char)c : EOF;

means: read one char into c; if exactly one char was read, return it; otherwise return (the int) EOF.

Note that getchar() returns an int, so the conversion is char -> unsigned char -> int.
