简体   繁体   中英

what's wrong with this c function?

The following function has to split a line in 2 or more lines each of which is shorter than s.

char **splitline(FILE *fp, int s)
{
    char **l;
    char c;
    int ccounter;
    int lcounter;

    c = fgetc(fp);
    if (c == EOF)
    {
        return NULL;
    }

    lcounter=0;
    l = malloc(sizeof(char **));
    l[lcounter] = malloc((SIZE+2)*sizeof(char));

    ccounter = 0;
    while (c != EOF && c != '\n')
    {
        l[lcounter][ccounter] = c;
        ccounter++;
        c = fgetc(fp);

        if (ccounter == SIZE)
        {
            l[lcounter][ccounter] = '\n';
            ccounter++;
            l[lcounter][ccounter] = '\0';

            realloc(l, (lcounter+2) * sizeof(char **));

            lcounter++;

            l[lcounter] = malloc((SIZE+2) * sizeof(char));
            ccounter = 0;
        }
    }

    if (ccounter == 0)
    {
        l[lcounter][ccounter] = '\0'; 
    }
    else
    {
        l[lcounter][ccounter] = '\n';
        ccounter++;
        l[lcounter][ccounter] = '\0';

        realloc(l, (lcounter+2) * sizeof(char **));

        lcounter++;

        ccounter = 0;
        l[lcounter] = malloc((SIZE+2) * sizeof(char));
        l[lcounter][ccounter] = '\0';
    }

    return l;
}
  • You are using an undefined constant SIZE instead of the argument s to your function to control the maximum lengths of your lines. Since s is a signed integer, you should check it for sanity (it should be positive (non-zero) number, maybe not larger than 1 MiB; maybe you want to set a bigger lower limit than 1; maybe you default to 80 if the caller screws up, or maybe you return with an error).

  • You are using char c; to save the value read with fgetc() ; sadly, that means your EOF test is unreliable. Either you'll stop prematurely when someone supplies 'ÿ' (y-umlaut, hex 0xFF in ISO 8859-1, 8859-15, or U+00FF in ISO 10646 - Unicode) or you won't stop at all, depending on whether the type char is signed or unsigned. Always remember: getchar() and relatives return an int !

  • A variable l is inviting confusion with the constant 1 ; generally avoid it.

  • Your main loop condition would be better if you tested the result of fgetc() directly.

     int c; while ((c = fgetc(fp)) != EOF && c != '\\n') { ... } 

    As it is, you read a character part way through the loop and don't check it properly. You might then use ungetc() to push the character first read back into the input stream; it makes the input processing more uniform. Alternatively, you might set things up so that everything works correctly if the first call to fgetc() is in the loop control and it returns EOF.

  • As Tommy pointed out, you must capture the output of realloc() ; there is no guarantee that it will return its input pointer as the result. You should also learn now that you do not save the result of realloc() into the variable specified as its first argument. You immediately leak memory if you do and the reallocation fails (because you've lost the pointer to the old memory - it just got zeroed). So, to be safe, you use:

     char **new_array = realloc(l, (lcounter+1) * sizeof(*new_array)); if (new_array == 0) ...handle out of memory... else l = new_array; 

    A couple of points here. You allocated (lcounter+2) values, but I think you only ever use one of them (unless there's a terminal null pointer to mark the end of the array). You specified sizeof(char **) but actually you want an array of char * values. Fortunately, all (object) pointers are the same size - the C standard guarantees that; only POSIX guarantees that function pointers are the same size as object pointers (the C standard does not). So, sizeof(char **) == sizeof(char *) and you are safe, but you are not asking for what you wanted.

  • A corollary to the discussion about realloc() possibly failing is malloc() may fail too. You should error check your memory allocation - or use a set of cover functions for the standard library that only return if the pointer that is returned is not null. If you don't check it, your program will crash eventually because of a memory allocation failure - even on a machine with 24 GiB of main memory (though it might take a while to get to that point).

  • There is a lot of repetition in the code. You should look to avoid that. It means using a sub-function, perhaps, to manage the memory allocations.

If you fix those, you're in with a fighting chance of getting a working function. You should also write the code to release the allocated memory that you return, so that the users doesn't have to devise a method to do so on your behalf. Always worry about who is going to release allocated memory and how it will be released.


Chris Lutz asked:

And where does the standard guarantee that all object pointer types are the same size?

I believe the relevant section of the (C99 - ISO/IEC 9899:1999) standard is:

§6.3.2.3 Pointers

1 A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

That basically says all object pointer types can be converted to void pointers and back again without loss of information, which means that they must all be the same size. Note that the category 'object pointers' does not include 'function pointers'.

The relevant section of the POSIX standard (about function pointers) is §2.12.3 (at the bottom of the linked section).

The main one I can spot: realloc may not keep your buffer at the same address. You should use

 l = realloc(...)

Actually, see the comment of Matteo below. Realloc may do one of three things:

  • extend the size of the buffer you already have (in which case you'll get the same pointer back);
  • allocate a new buffer elsewhere, copy your buffer's contents across and free the original (in which case you'll get a different pointer back);
  • find itself unable to get you a buffer of the required size anywhere and fail, leaving your current contents intact (returning NULL).

In the final case, assignment without checking the return value would cause a memory leak.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM