
Parsing character array to words held in pointer array (C-programming)

I am trying to separate each word from a character array and put them into a pointer array, one word for each slot. Also, I am supposed to use isspace() to detect blanks. But if there is a better way, I am all ears. At the end of the code I want to print out the content of the parameter array.

Let's say the line is: "this is a sentence". What happens is that it prints out "sentence" (the last word in the line, and usually followed by some random character) 4 times (the number of words). Then I get "Segmentation fault (core dumped)".

Where am I going wrong?

int split_line(char line[120])
{
    char *param[21]; // Here I want to put one word for each slot
    char buffer[120]; // Word buffer
    int i; // For characters in line
    int j = 0; // For param words   
    int k = 0; // For buffer chars

    for(i = 0; i < 120; i++)    
    {
        if(line[i] == '\0')
            break;
        else if(!isspace(line[i]))   
        {    
            buffer[k] = line[i];
            k++;    
        }    
        else if(isspace(line[i]))
        {
            buffer[k+1] = '\0';
            param[j] = buffer; // Puts word into pointer array   
            j++;    
            k = 0;    
        }   
        else if(j == 21)    
        {
            param[j] = NULL;    
            break;    
        }    
    }    

    i = 0;
    while(param[i] != NULL)    
    {    
        printf("%s\n", param[i]);    
        i++;    
    }    
    return 0;    
}

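There are several small problems in this code:

  • param[j] = buffer; k = 0; : every slot points to the same buffer, and you restart writing at its beginning, so each new word overwrites the previous ones. That is why the last word is printed once per slot.
  • buffer[k+1] = '\0'; : k already indexes the position after the last character, so the terminator lands one place too far and buffer[k] keeps whatever was there before (the random character you see).
  • if(!isspace(line[i])) ... else if(isspace(line[i])) ... else ... : isspace(line[i]) is either true or false, so one of the first two branches is always taken and the third one (j == 21) is never reached. As a result param is never terminated with NULL, and the printing loop runs past the stored words into uninitialized pointers.
  • if (line[i] == '\0') : you forget to terminate the current word with a '\0' and to store it in param.
  • If there are multiple white spaces, you currently (try to) add empty words to param.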

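Here is a working version:

#include <ctype.h>
#include <stdio.h>

int split_line(char line[120])
{
    char *param[21]; // Up to 20 words plus the terminating NULL
    char buffer[120]; // Word buffer: the words are stored back to back
    int i; // For characters in line
    int j = 0; // For param words
    int k = 0; // For chars of the current word
    int inspace = 1; // Nonzero while skipping blanks (leading blanks are ignored)

    param[j] = buffer;

    // line must be '\0'-terminated, as a line read with fgets is
    for(i = 0; i < 120; i++) {
        if(line[i] == '\0') {
            if(k > 0) // A word is pending: terminate and count it
                param[j++][k] = '\0';
            param[j] = NULL;
            break;
        }
        else if(!isspace((unsigned char)line[i])) {
            inspace = 0;
            param[j][k++] = line[i];
        }
        else if(!inspace) {
            inspace = 1;
            param[j++][k] = '\0'; // Terminate the current word
            if(j == 20) { // param is full: keep the last slot for NULL
                param[j] = NULL;
                break;
            }
            param[j] = &(param[j-1][k+1]); // Next word starts right after the '\0'
            k = 0;
        }
    }

    i = 0;
    while(param[i] != NULL)
    {
        printf("%s\n", param[i]);
        i++;
    }
    return 0;
}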

I only fixed the errors. I leave the following improvements to you as an exercise:

  • The split_line routine should not print the words itself but rather return an array of words. Beware that you cannot return an automatic (local) array, but that would be another question.
  • You should not have magic constants in your code (120). You should at least use a #define and symbolic constants, or better, accept a line of any size. Here again it is not simple, because you will have to malloc and free at the appropriate places, and again that would be a different question.
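As a rough sketch of the first improvement, one option is to let the caller supply both the pointer array and the storage, so that nothing automatic is returned. The names split_words, LINE_SIZE and MAX_WORDS below are illustrative assumptions, not part of the original code:

```c
#include <ctype.h>
#include <stddef.h>

#define LINE_SIZE 120  /* assumed maximum line length, including the '\0' */
#define MAX_WORDS 20   /* assumed maximum number of words */

/* Hypothetical helper: splits line into words without printing anything.
 * The words are copied into buffer (LINE_SIZE bytes) and param receives
 * pointers into it, followed by NULL. param needs MAX_WORDS + 1 slots.
 * Returns the number of words stored. */
int split_words(const char *line, char *buffer, char *param[])
{
    int i;
    int count = 0;   /* words stored so far */
    int k = 0;       /* next free position in buffer */
    int inword = 0;  /* nonzero while inside a word */

    for (i = 0; i < LINE_SIZE - 1 && line[i] != '\0'; i++) {
        if (!isspace((unsigned char)line[i])) {
            if (!inword) {
                if (count == MAX_WORDS)
                    break;               /* no room for another word */
                param[count++] = &buffer[k];
                inword = 1;
            }
            buffer[k++] = line[i];
        } else if (inword) {
            buffer[k++] = '\0';          /* end of the current word */
            inword = 0;
        }
    }
    if (inword)
        buffer[k] = '\0';                /* terminate the last word */
    param[count] = NULL;
    return count;
}
```

The caller would declare char buffer[LINE_SIZE] and char *param[MAX_WORDS + 1], call split_words(line, buffer, param), and print or otherwise use the words itself. Accepting lines of any length would still need malloc and free, which this sketch deliberately avoids.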

Anyway good luck in learning that good old C :-)

This line does not seem right to me:

param[j] = buffer;

because you keep assigning the same value, buffer, to every param[j], so all slots end up pointing at the same memory.

I would suggest you copy all the chars from line[120] to buffer[120], then point each param[j] to buffer + Next_Word_Position, that is, to the location inside buffer where the next word starts.
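A minimal sketch of that idea, assuming the blank after each word in the copy may be overwritten with '\0' so that every pointer refers to a properly terminated string. The function name point_to_words and its parameters are hypothetical:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies line into buffer (size bytes), then stores
 * in param a pointer to the start of each word inside buffer.
 * At most max_words pointers are stored, followed by NULL, so param
 * needs max_words + 1 slots. Returns the number of words found. */
int point_to_words(const char *line, char *buffer, size_t size,
                   char *param[], int max_words)
{
    int count = 0;
    char *p;

    if (size == 0) {
        param[0] = NULL;
        return 0;
    }
    strncpy(buffer, line, size - 1);
    buffer[size - 1] = '\0';             /* strncpy may not terminate */

    p = buffer;
    while (*p != '\0') {
        while (isspace((unsigned char)*p))
            p++;                         /* skip blanks before a word */
        if (*p == '\0' || count == max_words)
            break;
        param[count++] = p;              /* the word starts here */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;                         /* walk to the end of the word */
        if (*p != '\0')
            *p++ = '\0';                 /* cut the word off in the copy */
    }
    param[count] = NULL;
    return count;
}
```

Because only the copy is modified, the original line stays intact, and each param[j] stays valid for as long as buffer does.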

You may want to look at strtok in string.h. It sounds like this is what you are looking for, as it separates words/tokens based on the delimiters you choose. To separate by spaces, simply use:

dest = strtok(src, " ");

Here src is the source string and dest receives a pointer to the first token in it. Calling strtok repeatedly until dest == NULL gives you all of the separated words; each time, store dest in the next slot of your pointer array. Note that passing NULL as the first argument makes strtok continue parsing where it left off, so after the initial call outside your loop, pass NULL instead of src inside it. Also be aware that strtok modifies the source string: it overwrites each delimiter it finds with '\0'. I hope that helps. Good luck!
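The loop described above could look roughly like this. The helper name split_with_strtok and the max_words parameter are assumptions made for the example, and tabs and newlines are added to the delimiter set so that a trailing newline from fgets does not end up in the last word:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits src in place with strtok and stores up to
 * max_words token pointers in param, followed by NULL (so param needs
 * max_words + 1 slots). src must be writable, because strtok overwrites
 * the delimiters with '\0'. Returns the number of tokens stored. */
int split_with_strtok(char *src, char *param[], int max_words)
{
    int count = 0;
    char *dest = strtok(src, " \t\n");   /* first call: pass the string */

    while (dest != NULL && count < max_words) {
        param[count++] = dest;
        dest = strtok(NULL, " \t\n");    /* later calls: pass NULL */
    }
    param[count] = NULL;
    return count;
}
```

The pointers in param refer into src itself, so src has to outlive them, and a string literal must not be passed as src.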
