
What is the correct behavior of strtol?

I'm creating a wrapper function around strtol() for int16_t, and I ran across a problem: how to handle input such as 0x? That is, is it valid as the digit 0 followed by the non-digit x, or is it invalid because nothing follows 0x?

tl;dr results: Windows rejects 0x completely, as in the latter case, but Debian sees it as the digit 0 followed by the non-digit character x, as in the former case.

Implementations tested 1

  • Windows
    • Visual C++ 2015 (henceforth MSVC)
    • MinGW-w64 GCC (5.2.0)
  • Debian

For further comparison, I included sscanf() with a format specification of " %li", which is supposed to act like strtol() with base==0. Surprisingly, I ended up with two different results.

Code:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *foo = "        0x";
    char *remainder;
    long n;
    int base = 0;

    errno = 0;  /* clear any stale errno so the report below reflects strtol() only */
    n = strtol(foo, &remainder, base);
    printf("strtol(): ");
    if (*remainder == 'x')
        printf("OK\n");
    else {
        printf("NOMATCH\n");
        printf("  Remaining -- %s\n", remainder);
        printf("  errno: %s (%d)\n", strerror(errno), errno);
    }

    errno = 0;

    base = sscanf(foo, " %li", &n);
    printf("sscanf(): ");
    if (base == 1)
        printf("OK\n");
    else {
        printf("NOMATCH\n");
        printf("  Returned: %s\n", base==0 ? "0" : "EOF");
        printf("  errno: %s (%d)\n", strerror(errno), errno);
    }
}

Debian results:

strtol(): OK
sscanf(): OK

Windows results:

strtol(): NOMATCH
  Remaining --         0x
  errno: No error (0)
sscanf(): NOMATCH
  Returned: 0
  errno: No error (0)

There was no error in any case, yet the results differed. Initializing base to 16 instead of 0 made no difference at all, nor did removal of the leading blanks in the test string.

I honestly expected the result that I got on Debian: 0 is parsed as a valid value (whether base is 0 or 16), and remainder is set to point to x because no valid hexadecimal digits immediately follow the x (had there been any, the 0x prefix would have been skipped whether base is 0 or 16).

So now I'm confused about the correct behavior in this situation. Is either of these behaviors in violation of the C standard? My interpretation of the relevant sections of the standard is that Debian is correct, but I'm really not certain.


1 For those who are interested, Cygwin exhibited the Windows behavior for strtol() and the Debian behavior for sscanf(). Since the behavior of %li is supposed to match strtol() with base==0, I considered this a bug and ignored its results.

The specification from 7.22.1.4 of C11:

  1. If the value of base is zero, the expected form of the subject sequence is that of an integer constant as described in 6.4.4.1,

  2. The subject sequence is defined as the longest initial subsequence of the input string, starting with the first non-white-space character, that is of the expected form

As described in 6.4.4.1, 0 is an integer constant, but 0x is not. Therefore, the subject sequence is 0 for your input case.
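The "longest initial subsequence" rule for base 0 can be sketched by hand, independently of any particular C library. This is only an illustration of the rule as quoted above, not how any real strtol() is implemented; the function name subject_end_base0 is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns the offset (from s) just past the subject
 * sequence that a base-0 conversion should consume, or 0 if the subject
 * sequence is empty. It only locates the sequence; it converts nothing. */
size_t subject_end_base0(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (isspace(*p))                 /* leading white space is skipped */
        p++;
    if (*p == '+' || *p == '-')         /* optional sign */
        p++;

    if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X') && isxdigit(p[2])) {
        /* The 0x prefix only counts when a hex digit follows it. */
        p += 2;
        while (isxdigit(*p))
            p++;
    } else if (*p == '0') {
        /* Otherwise a leading 0 starts an octal constant. */
        while (*p >= '0' && *p <= '7')
            p++;
    } else if (isdigit(*p)) {
        while (isdigit(*p))             /* decimal constant */
            p++;
    } else {
        return 0;                       /* nothing of the expected form */
    }
    return (size_t)(p - (const unsigned char *)s);
}
```

For the test string from the question (eight blanks followed by 0x), this returns 9: the blanks and the 0 are consumed, and the x is left over.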

So the correct behaviour is to return 0, and *endptr should be left pointing to the x.
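For the int16_t wrapper mentioned in the question, one possible shape is sketched below. The name parse_int16, its return codes, and the rest parameter are assumptions made for illustration; the sketch relies on strtol() behaving as described above and does not work around libraries that reject 0x outright.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: converts the start of s to an int16_t.
 * Returns 0 on success, -1 if no conversion was performed, and
 * -2 if the value does not fit in int16_t. If rest is non-NULL,
 * *rest receives the position where strtol() stopped. */
int parse_int16(const char *s, int base, int16_t *out, const char **rest)
{
    char *end;
    long v;

    errno = 0;                          /* so ERANGE below is from this call */
    v = strtol(s, &end, base);
    if (rest != NULL)
        *rest = end;

    if (end == s)
        return -1;                      /* empty subject sequence */
    if (errno == ERANGE || v < INT16_MIN || v > INT16_MAX)
        return -2;                      /* out of range for long or int16_t */

    *out = (int16_t)v;
    return 0;
}
```

With a conforming strtol(), calling parse_int16("0x", 0, &v, &rest) should succeed with v == 0 and rest pointing at the x; on the Windows implementations tested in the question it would instead report no conversion. The caller decides whether trailing characters such as that x are an error.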
