I'm creating a wrapper function around strtol() for int16_t, and I ran across a problem: how to handle input such as 0x? That is, is it valid as the digit 0 followed by the non-digit x, or is it invalid because nothing follows 0x?
tl;dr results: Windows rejects 0x completely, as described in the latter case, but Debian sees it as the digit 0 followed by the non-digit character x, as explained in the former case.
Implementations tested 1
For further comparison purposes, I included sscanf() with a format specification of " %li", which is supposed to act like strtol() with base==0. Surprisingly, I ended up with two different results.
Code:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *foo = " 0x";
    char *remainder;
    long n;
    int base = 0;

    errno = 0;  /* clear errno so the report below reflects strtol() only */
    n = strtol(foo, &remainder, base);
    printf("strtol(): ");
    if (*remainder == 'x')
        printf("OK\n");
    else {
        printf("NOMATCH\n");
        printf(" Remaining -- %s\n", remainder);
        printf(" errno: %s (%d)\n", strerror(errno), errno);
    }

    errno = 0;
    /* base is reused here to hold the return value of sscanf() */
    base = sscanf(foo, " %li", &n);
    printf("sscanf(): ");
    if (base == 1)
        printf("OK\n");
    else {
        printf("NOMATCH\n");
        printf(" Returned: %s\n", base==0 ? "0" : "EOF");
        printf(" errno: %s (%d)\n", strerror(errno), errno);
    }
}
Debian results:
strtol(): OK
sscanf(): OK
Windows results:
strtol(): NOMATCH
Remaining -- 0x
errno: No error (0)
sscanf(): NOMATCH
Returned: 0
errno: No error (0)
There was no error in any case, yet the results differed. Initializing base to 16 instead of 0 made no difference at all, nor did removal of the leading blanks in the test string.
I honestly expected the result that I got on Debian: 0 is parsed as a valid value (whether base is 0 or 16), and remainder is set to point to x after seeing there were no valid hexadecimal digits immediately following x (had there been any, the 0x would be skipped whether base is 0 or 16).
So now I'm confused about the correct behavior in this situation. Is either of these behaviors in violation of the C standard? My interpretation of the relevant sections of the standard is that Debian is correct, but I'm really not certain.
1 Cygwin exhibited the Windows behavior in the case of strtol() and the Debian behavior in the case of sscanf(), for those who are interested. Since the behavior of %li is supposed to match strtol() with base==0, I considered this a bug and ignored its results.
The specification from 7.22.1.4 of C11:
If the value of base is zero, the expected form of the subject sequence is that of an integer constant as described in 6.4.4.1,
The subject sequence is defined as the longest initial subsequence of the input string, starting with the first non-white-space character, that is of the expected form
As described in 6.4.4.1, 0 is an integer constant, but 0x is not. Therefore, the subject sequence is 0 for your input case.
So the correct behaviour is to return 0, and endptr should be left pointing to the x.
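For the int16_t wrapper mentioned in the question, one way to get this behaviour regardless of the C library is to fall back to a manual check when strtol() reports no conversion. The following is only a minimal sketch: parse_int16 and its signature are hypothetical names, not something from the question, and the fallback assumes the non-conforming library signals "no conversion" by leaving the end pointer at the start of the string, as the Windows output above suggests.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper (illustration only): parse an int16_t using
 * strtol() rules. Returns true if a conversion was performed; *out gets
 * the value and *end points at the first unparsed character. On failure
 * *out is left untouched; errno is ERANGE if the value did not fit. */
bool parse_int16(const char *s, int base, int16_t *out, const char **end)
{
    assert(s != NULL && out != NULL && end != NULL);

    char *stop;
    errno = 0;
    long v = strtol(s, &stop, base);

    if (stop == s) {
        /* strtol() reported no conversion. Assumption: a library that
         * rejects "0x" outright ends up here. Per the standard, the
         * subject sequence is then just "0", so emulate that result. */
        const char *p = s;
        while (isspace((unsigned char)*p))
            p++;
        if (*p == '+' || *p == '-')
            p++;
        if ((base == 0 || base == 16) &&
            p[0] == '0' && (p[1] == 'x' || p[1] == 'X')) {
            *out = 0;
            *end = p + 1;   /* leave *end pointing at the 'x' */
            return true;
        }
        *end = s;
        return false;
    }

    *end = stop;
    if (errno == ERANGE || v < INT16_MIN || v > INT16_MAX) {
        errno = ERANGE;     /* out of range for long or for int16_t */
        return false;
    }
    *out = (int16_t)v;
    return true;
}
```

With this sketch, parse_int16(" 0x", 0, &v, &end) should yield v == 0 with end pointing at the x on either kind of implementation, because the fallback only runs when strtol() itself converted nothing.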