
Function to split string sometimes gives segmentation fault

I have the following function to split a string. Most of the time it works fine, but sometimes it randomly causes a segmentation fault.

char** splitString(char* string, char* delim){
    int count = 0;
    char** split = NULL;
    char* temp = strtok(string, delim);

    while(temp){
        split = realloc(split, sizeof(char*) * ++count);

        split[count - 1] = temp;
        temp = strtok(NULL, " ");
    }

    int i = 0;
    while(split[i]){
        printf("%s\n", split[i]);
        i++;
    }

    split[count - 1][strlen(split[count - 1]) - 1] = '\0';
    return split;
}
split[count - 1][strlen(split[count - 1]) - 1] = '\0';

should look like

split[count - 1] = NULL;

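The array is never terminated, so while(split[i]) keeps reading past the end of the allocation until it happens to find a zero pointer or crashes. Note that split[count - 1] holds the last token, so assigning NULL there discards it. To keep every token, have realloc reserve one extra slot (count + 1 pointers) and store NULL in split[count]. If the input contains no tokens at all, count is 0 and split is still NULL, so both split[count - 1] and split[0] are invalid accesses.

Put the terminating assignment before while(split[i]) so that the loop can stop when it reaches NULL.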

The strtok function is not reentrant; use strtok_r() instead, which is the reentrant version of strtok().

You have a number of subtle issues, not the least of which is that your function will segfault if you pass a string literal. strtok modifies the string it tokenizes, and a string literal is stored in read-only memory. Your compiler has no way of warning about this unless you have declared string as const char *string. Note also that your second call, strtok(NULL, " "), hardcodes a space instead of using delim.

To avoid these problems, simply make a copy of the string you will tokenize. That way, regardless of how the string you pass to the function was declared, you avoid the problem altogether.

You should also pass a pointer to size_t as a parameter to your function in order to make the number of tokens available back in the calling function. That way you do not have to leave a sentinel NULL as the final pointer in the pointer to pointer to char you return. Just pass a pointer and update it to reflect the number of tokens parsed in your function.

Putting those pieces together, and cleaning things up a bit, you could use the following to do what you are attempting to do:

char **splitstr (const char *str, const char *delim, size_t *n)
{
    char *cpy = strdup (str), *p = cpy; /* copy of str & pointer */
    char **split = NULL;                /* pointer to pointer to char */
    *n = 0;                             /* zero 'n' */

    if (!cpy)   /* validate strdup succeeded */
        return NULL;

    for (p = strtok (p, delim); p; p = strtok (NULL, delim)) {
        void *tmp = realloc (split, sizeof *split * (*n + 1));
        if (!tmp) { /* validate realloc succeeded */
            fprintf (stderr, "splitstr() error: memory exhausted.\n");
            break;
        }
        split = tmp;                /* assign tmp to split */
        if (!(split[*n] = strdup (p)))  /* allocate/copy to split[n] */
            break;                      /* stop if the copy failed */
        (*n)++;                     /* count the stored token */
    }
    free (cpy);     /* free cpy */
    return split;   /* return split */
}

Adding a short example program, you could do the following:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char **splitstr (const char *str, const char *delim, size_t *n)
{
    char *cpy = strdup (str), *p = cpy; /* copy of str & pointer */
    char **split = NULL;                /* pointer to pointer to char */
    *n = 0;                             /* zero 'n' */

    if (!cpy)   /* validate strdup succeeded */
        return NULL;

    for (p = strtok (p, delim); p; p = strtok (NULL, delim)) {
        void *tmp = realloc (split, sizeof *split * (*n + 1));
        if (!tmp) { /* validate realloc succeeded */
            fprintf (stderr, "splitstr() error: memory exhausted.\n");
            break;
        }
        split = tmp;                /* assign tmp to split */
        if (!(split[*n] = strdup (p)))  /* allocate/copy to split[n] */
            break;                      /* stop if the copy failed */
        (*n)++;                     /* count the stored token */
    }
    free (cpy);     /* free cpy */
    return split;   /* return split */
}

int main (void) {

    size_t n = 0;                   /* number of strings */
    char *s = "My dog has fleas.",  /* string to split */
        *delim = " .\n",            /* delims */
        **strings = splitstr (s, delim, &n);    /* split s */

    for (size_t i = 0; i < n; i++) {    /* output results */
        printf ("strings[%zu] : %s\n", i, strings[i]);
        free (strings[i]);          /* free string */
    }
    free (strings);     /* free pointers */

    return 0;
}

Example Use/Output

$ ./bin/splitstrtok
strings[0] : My
strings[1] : dog
strings[2] : has
strings[3] : fleas

Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so that (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to ensure that you do not attempt to write beyond/outside the bounds of your allocated block of memory or attempt to read or base a conditional jump on an uninitialized value, and finally to confirm that you free all the memory you have allocated.

For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through one.

$ valgrind ./bin/splitstrtok
==14471== Memcheck, a memory error detector
==14471== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==14471== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==14471== Command: ./bin/splitstrtok
==14471==
strings[0] : My
strings[1] : dog
strings[2] : has
strings[3] : fleas
==14471==
==14471== HEAP SUMMARY:
==14471==     in use at exit: 0 bytes in 0 blocks
==14471==   total heap usage: 9 allocs, 9 frees, 115 bytes allocated
==14471==
==14471== All heap blocks were freed -- no leaks are possible
==14471==
==14471== For counts of detected and suppressed errors, rerun with: -v
==14471== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

Look things over and let me know if you have further questions.
