In Linux with C, I don't understand the difference between char* and unsigned char* when reading/writing a binary buffer.
When must I not use char* and need to use unsigned char* instead?
First recall that C has unsigned char, signed char and char: 3 distinct types. char has the same range as either unsigned char or signed char.
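As a small illustration of the "3 distinct types" point, here is a sketch that assumes a C11 compiler (for _Generic). The macro CHAR_KIND, the KIND_* constants and plain_char_is_signed() are hypothetical names made up for this example. The selection only compiles because the three character types really are distinct; if char were merely an alias of one of the others, the duplicate association would be rejected.

```c
#include <limits.h>

/* Hypothetical labels for the three character types. */
enum { KIND_OTHER = -1, KIND_CHAR, KIND_SIGNED_CHAR, KIND_UNSIGNED_CHAR };

/* Hypothetical macro: reports which character type an expression has.
   Listing all three types is only legal because they are distinct. */
#define CHAR_KIND(x) _Generic((x),          \
    char:          KIND_CHAR,               \
    signed char:   KIND_SIGNED_CHAR,        \
    unsigned char: KIND_UNSIGNED_CHAR,      \
    default:       KIND_OTHER)

/* Hypothetical helper: 1 if plain char is signed on this implementation. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}
```

Whichever way plain_char_is_signed() answers, CHAR_MIN and CHAR_MAX match either the signed char limits or the unsigned char limits, yet CHAR_KIND((char)0) still reports KIND_CHAR.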
[Edit]
OP added "When I reading/writing binary buffer", so the sections far below (my original post) deal with "what is the difference between char* and unsigned char*" using a sample case without that read/write concern. This section addresses the read/write concern.
Reading/writing binary data via <stdio.h> can be done with any I/O function, although it is more common to use fread()/fwrite().
For byte-oriented data, all I/O functions behave as if they called fgetc()/fputc():
"The byte input functions read characters from the stream as if by successive calls to the fgetc function." C17dr § 7.21.3 11
"The byte output functions write characters to the stream as if by successive calls to the fputc function." § 7.21.3 12
So let us look at those two.
"... the fgetc function obtains that character as an unsigned char ..." § 7.21.7.1 2
"The fputc function writes the character specified by c (converted to an unsigned char)" § 7.21.7.3 2
Thus all I/O at the lowest level is best thought of as reading/writing unsigned char.
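The fgetc()/fputc() behavior quoted above can be checked with a short round trip. This is only a sketch: roundtrip_byte() is a hypothetical helper, it relies on tmpfile() succeeding, and the values mentioned afterwards assume CHAR_BIT is 8.

```c
#include <stdio.h>

/* Hypothetical helper: writes one byte to a temporary file with fputc(),
   reads it back with fgetc(), and returns what fgetc() reported.
   Returns EOF if the temporary file cannot be created or written. */
int roundtrip_byte(int value) {
    FILE *f = tmpfile();              /* opened in binary update mode */
    if (f == NULL) {
        return EOF;
    }
    int result = EOF;
    if (fputc(value, f) != EOF) {     /* value is converted to unsigned char */
        rewind(f);                    /* reposition before switching to input */
        result = fgetc(f);            /* reported in the unsigned char range */
    }
    fclose(f);
    return result;
}
```

Under those assumptions, roundtrip_byte(0xFF) and roundtrip_byte(-1) both come back as 255: the byte is never reported as a negative value, whether or not char is signed.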
Now to directly address
"When must I not use char* and need to use unsigned char*?" (OP)
With writing, pointers such as char*, unsigned char* or others can be used in OP-level code, yet the underlying output function accesses the data via unsigned char *. This has no impact on how OP's write executes, except that if char were encoded as ones' complement or sign-magnitude, a trap representation would not get detected.
Likewise with reading, the underlying input function saves data via unsigned char * and no traps occur. A single byte read via int fgetc() is reported as a value in the unsigned char range, even if char is signed.
The importance of using unsigned char* vs. char* when reading/writing a binary buffer comes not so much from the I/O call itself (it is all unsigned char * access), but from setting up the data prior to writing and interpreting the data after reading - see memcmp() below.
"When must I not use char* and need to use unsigned char*?"
A good example is with string related code.
Although functions in <string.h> use char* in their parameters, the implementations perform as if char was unsigned char, even when char is signed.
"For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value)." C17dr § 7.24.1 3
So even if char is signed, functions like int strcmp(const char *a, const char *b) perform as if they were int strcmp(const unsigned char *a, const unsigned char *b).
This makes a difference when strings differ by a char c and a char d with values of different signs (char being signed).
E.g. assume c < 0, d > 0:
// Accessed via char * and char is signed
c < d is true
// Accessed via unsigned char *
c < d is false
This results in a different sign for the strcmp() return value and so affects how strings sort.
// Incorrect code when `char` is signed.
int strcmp(const char *a, const char *b) {
  while (*a == *b && *a) { a++; b++; }
  return (*a > *b) - (*a < *b);
}
// Correct code when `char` is signed or unsigned, 2's complement or not
int strcmp(const char *a, const char *b) {
  // Access the characters as unsigned char, as the standard specifies.
  const unsigned char *ua = (const unsigned char *) a;
  const unsigned char *ub = (const unsigned char *) b;
  while (*ua == *ub && *ua) { ua++; ub++; }
  return (*ua > *ub) - (*ua < *ub);
}
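To see the effect on sorting mentioned above, here is a minimal sketch that sorts strings with the library strcmp() through qsort(). The names compare_strings() and sort_strings() are hypothetical. Because strcmp() compares as unsigned char, a string starting with a byte above 0x7F sorts after plain ASCII strings, even where char is signed.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: the array elements are `const char *`, so each
   argument points at a pointer to the string. */
static int compare_strings(const void *pa, const void *pb) {
    const char *a = *(const char *const *)pa;
    const char *b = *(const char *const *)pb;
    return strcmp(a, b);              /* compares as unsigned char */
}

/* Hypothetical helper: sorts an array of strings into strcmp() order. */
void sort_strings(const char **strings, size_t count) {
    qsort(strings, count, sizeof strings[0], compare_strings);
}
```

For example, sorting { "\xC3\xA9", "zebra", "apple" } yields "apple", "zebra", "\xC3\xA9". A comparator that compared through a signed char would instead place the "\xC3\xA9" string first.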
[Edit]
The same applies to binary data read and compared with memcmp().
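A short sketch of the memcmp() case, with compare_buffers() as a hypothetical wrapper that only normalizes the result to -1, 0 or 1. The example values in the note that follows assume 2's complement with 8-bit bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: three-way comparison of two buffers of n bytes.
   memcmp() interprets every byte as unsigned char, whatever type the
   caller used to fill the buffers. */
int compare_buffers(const void *a, const void *b, size_t n) {
    int r = memcmp(a, b, n);
    return (r > 0) - (r < 0);
}
```

With signed char buffers { 1, -128 } and { 1, 127 }, compare_buffers() reports the first as greater: -128 is stored as the byte 0x80, which is larger than 0x7F once viewed as unsigned char, even though -128 < 127 as signed values.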
With a signed char that is not 2's complement, there are two zeros: +0 and -0. +0 ends a string when properly viewed as an unsigned char. -0 is not a null character that terminates a string, even though as a signed char it has a value of zero.
// Incorrect code when `char` is signed and not 2's complement.
// Conversion to `unsigned char` done too late.
int strcmp(const char *a, const char *b) {
  while ((unsigned char)*a == (unsigned char)*b && (unsigned char)*a) { a++; b++; }
  return ((unsigned char)*a > (unsigned char)*b) - ((unsigned char)*a < (unsigned char)*b);
}