Splitting a line into words and storing them in a char array using strtok

Question

I have this simple function that parses a line into tokens... but I'm missing something.

int parse_line(char *line,char **words){

   int wordc=0;

   /* get the first token */
   char *word = strtok(line, " ");
   words[wordc]=(char*)malloc(256*sizeof(char));
   strcpy(words[wordc++],word );

   /* walk through other tokens */
    while( word != NULL ) {
        word = strtok(NULL, " ");
        words[wordc]=(char*)malloc(256*sizeof(char));
        strcpy(words[wordc++],word );
    }

    return wordc;
}

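When I run it, I get a segmentation fault. The first argument I pass is a char[256] line, and the second is of course a char **words, for which I malloc memory first, like this: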

  char **words = (char **)malloc(256 * sizeof(char *));

main:
.
.
.
char buffer[256];
char **words = (char **)malloc(256 * sizeof(char *));
.
.
.
n = read(stdin, buffer, 255);
if (n < 0){
   perror("ERROR");
   break;
}

parse_line(buffer,words);

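When the program executes parse_line, it exits with a segmentation fault.

I found where the segfault happens. It is on this line:

strcpy(words[wordc++],word );

Specifically, on the first strcpy, before it even reaches the while loop.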

Answer 1

while( word != NULL ) {
    word = strtok(NULL, " ");
    words[wordc]=(char*)malloc(256*sizeof(char));
    strcpy(words[wordc++],word );
}

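Once the end of the line is reached, strtok returns NULL, so word will always end up NULL (as expected), and strcpy(words[wordc++],word ) is then undefined behavior (most likely a crash).

You need to reorganize the loop so that you never try to copy a NULL string.

@jxh suggested this solution, which handles word being NULL in either of your strcpy calls.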

/* get the first token */
char *word = strtok(line, " ");

while( word != NULL ) {
    words[wordc]=(char*)malloc(256*sizeof(char));
    strcpy(words[wordc++],word );
    word = strtok(NULL, " ");
}

I would do it like this (it uses less memory):

/* get the first token */
char *word = strtok(line, " ");

while( word != NULL ) {
    words[wordc++] = strdup(word);
    word = strtok(NULL, " ");
}
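
To illustrate why the reorganized loop matters, here is a minimal sketch. The helper name split_words and the max_words capacity parameter are assumptions for illustration and do not appear in the question. Because the NULL test runs before every copy, it also covers the case where the very first strtok call returns NULL, which happens when the line is empty or contains only delimiters. The sketch uses malloc sized to each token instead of strdup so that it builds with standard C alone, and it also treats '\n' as a delimiter.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): splits `line` in place
 * on spaces and newlines and stores a heap copy of each token in `words`.
 * Returns the number of tokens stored, or -1 if an allocation fails. */
int split_words(char *line, char **words, int max_words)
{
    int wordc = 0;

    /* strtok returns NULL right away if `line` contains no tokens */
    char *word = strtok(line, " \n");

    while (word != NULL && wordc < max_words) {
        size_t len = strlen(word) + 1;   /* +1 for the terminating NUL */

        words[wordc] = malloc(len);
        if (words[wordc] == NULL) {
            /* release the copies made so far before reporting failure */
            while (wordc > 0) {
                free(words[--wordc]);
            }
            return -1;
        }

        memcpy(words[wordc], word, len);
        wordc++;

        /* fetch the next token only after the current one is stored */
        word = strtok(NULL, " \n");
    }

    return wordc;
}
```

Called on a buffer holding only spaces, this returns 0 without ever touching words, whereas the code in the question would pass NULL to strcpy.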

Answer 2

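The following proposed code:

compiles cleanly
performs the desired functionality
properly checks for errors
displays the results to the user
fails to pass all the allocated memory to free(), so it leaks a lot of memory

And now, the proposed code: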

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// avoid 'magic' numbers in code
#define MAX_WORDS 256
#define MAX_LINE_LEN 256


int parse_line( char *line, char **words )
{
    int wordc=0;

    /* get the first token */
    char *token = strtok(line, " ");
    while( wordc < MAX_WORDS && token ) 
    {   
        words[wordc] = strdup( token );
        if( ! words[wordc] )
        {
            perror( "strdup failed" );
            exit( EXIT_FAILURE );
        }

        // implied else, strdup successful

        wordc++;

        // get next token
        token = strtok(NULL, " ");
    }

    return wordc;
}



int main( void )
{
    char buffer[ MAX_LINE_LEN ];

    // fix another problem with OPs code
    char **words = calloc( MAX_WORDS, sizeof( char* ) );
    if( ! words )
    {
        perror( "calloc failed" );
        exit( EXIT_FAILURE );
    }

    // implied else, calloc successful

    // note: would be much better to use 'fgets()' rather than 'read()'
    // read at most sizeof(buffer)-1 bytes so buffer[n] below stays in bounds
    ssize_t n = read( 0, buffer, sizeof( buffer ) - 1 );
    if (n <= 0)
    {
       perror("read failed");
       exit( EXIT_FAILURE );
    }

    // implied else, read successful

    // note: 'read()' does not NUL terminate the data
    buffer[ n ] = '\0';   

    int count = parse_line( buffer, words );

    for( int i = 0; i < count; i++ )
    {   
        printf( "%s\n", words[i] );
    } 
}

Here is a typical run of the program:

hello old friend  <-- user entered line
hello
old
friend
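
The leak mentioned in the list above could be closed with a small cleanup helper. This is only a sketch: free_words is a hypothetical name, not part of the proposed code, and it assumes every entry in words[0..count-1] came from strdup (or malloc).

```c
#include <stdlib.h>

/* Hypothetical cleanup helper (not part of the original answer):
 * releases each stored token and clears its slot. */
void free_words(char **words, int count)
{
    for (int i = 0; i < count; i++) {
        free(words[i]);    /* each token was heap-allocated by strdup */
        words[i] = NULL;   /* avoid leaving a dangling pointer behind */
    }
}
```

In main, calling free_words( words, count ) after the printing loop, followed by free( words ) for the pointer array itself, would release everything the program allocated.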

Answer 3

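Your answers are correct. But I was also getting a segfault because of read. I hadn't noticed that when I ran the program it did not stop to read input; it went straight past the call instead. What I did was change read to fgets, and it worked, together with your change! Can someone explain this to me? Why doesn't it stop at the read call?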
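
A likely explanation, offered with some caution: read() expects an int file descriptor as its first argument (STDIN_FILENO, i.e. 0, for standard input), while the code in the question passes stdin, which is a FILE *. With the wrong argument the call most probably fails immediately instead of waiting for input, although the exact behavior depends on the compiler and platform. In addition, read() does not NUL-terminate the data, so strtok would run over whatever was in the buffer. fgets() takes the FILE * directly and always terminates the string. The sketch below shows that approach; read_line is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): reads one line from
 * `in` into `buf`, which has room for `size` bytes.
 * Returns 1 on success, 0 on end of file or read error. */
int read_line(FILE *in, char *buf, int size)
{
    /* fgets stores at most size-1 characters and always adds a NUL */
    if (fgets(buf, size, in) == NULL) {
        return 0;
    }

    /* drop the trailing newline, if any, so it does not end up in a token */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

With this, main could call read_line( stdin, buffer, sizeof buffer ) and pass buffer to parse_line only when the call returns 1.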

3 answers

Answer 1: 3 (accepted), 2020-01-21 22:54:32
Answer 2: 1, 2020-01-22 00:24:48
Answer 3: 0, 2020-01-22 00:22:07
