简体   繁体   中英

Using strtok to split a string in c

In the code below I try to split a string to get names separated by "," from the string.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


int main(void)
{
    char *st = "Addison,Jayden,Sofia,Michael,Andrew,Lily,Benjamin";
    char *name[7];
    char *separate = ",";

    char *token = strtok(st, separate);
    int i = 0;
    while(token != NULL)
    {
        strcpy(name[i], token);
        token = strtok(NULL, ",");
        i++;
    }
    for (int j = 0; j < 7; j++)
    {
        printf("%s\n", name[j]);
    }
}

However, I run into segmentation fault when I try to run the code. I try to debug it and it seems that this specific line of code is where the error comes from:

char *token = strtok(st, separate);

Can anyone please tell me what did I do wrong?

You declared a pointer to a string literal

char *st = "Addison,Jayden,Sofia,Michael,Andrew,Lily,Benjamin";

You may not change string literals. Any attempt to changve a string literal results in undefined behavior.

From the C Standard (6.4.5 String literals)

7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined

On the other hand, the standard C function strtok changes the passed to it string.

From the C Standard (7.23.5.8 The strtok function)

4 The strtok function then searches from there for a character that is contained in the current separator string. If no such character is found, the current token extends to the end of the string pointed to by s1, and subsequent searches for a token will return a null pointer. If such a character is found, it is overwritten by a null character, which terminates the current token. The strtok function saves a pointer to the following character, from which the next search for a token will start.

So substitute the declaration for

char st[] = "Addison,Jayden,Sofia,Michael,Andrew,Lily,Benjamin";

Also you declared an array of pointers that is not initialized.

char *name[7];

So this statement

strcpy(name[i], token);

also invokes undefined behavior.

Instead you could just write

name[i] = token;

And as the variable i contains the number of tfokens then in general this loop

for (int j = 0; j < 7; j++)
{
    printf("%s\n", name[j]);
}

should be rewritten at least like (without using the magic number 7)

for (int j = 0; j < i; j++)
{
    puts( name[j] );
}

However, I run into segmentation fault when I try to run the code

Two problems:

  • Attempting to change a non-editable string.
  • Attempt to access location not owned by process.

Either of these can result in a crash. (seg-fault)

Memory allocation problem :
The declaration:

  char *name[7];

Creates an array of 7 pointers. Each of them require memory to be allocated before it can be used in this way:

strcpy(name[i], token);

Memory allocation example:

for(int i=0;i<7;i++)
{
    name[i] = malloc((maxNameLen+1)*sizeof(name[i]));
    if(!name[i]) 
    {
        //handle error
    }
}
//now each pointer has space for up to maxNameLen+1 characters

Non-editable string :

char *st = "Addison,Jayden,Sofia,Michael,Andrew,Lily,Benjamin";

A string to be edited cannot be in this form (ie a string literal ). It must be in an editable form in order for strtok() to change it, as it will when parsing. There are many string forms that are editable, Here are two examples:

//Create an editable form:
char st[] = {"Addison,Jayden,Sofia,Michael,Andrew,Lily,Benjamin"};

//Create an editable copy of original: (requires freeing when done using)
char *duplicate = strdup(st);
if(duplicate)
{
    //success duplicating, now parse duplicate.

There is another discussion here explaining editable/non-editable strings in C.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM