
Segmentation fault when using regexec/strtok_r in C

I'm having trouble figuring out where and why I'm getting a segmentation fault.

I'm writing a C program that prompts the user for a regular expression, compiles it, and then reads a string containing multiple sentences:

int main(void){

  char RegExp[50];
  regex_t CompiledRegExp;
  char *para;
  char delim[] = ".!?,";
  char *sentence;
  char *ptr1;

  printf("Enter regular expression: ");
  fgets(RegExp, 50, stdin);

  if (regcomp(&CompiledRegExp, RegExp, REG_EXTENDED | REG_NOSUB) != 0) {
    printf("ERROR: Something wrong in the regular expression\n");
    exit(EXIT_FAILURE);
  }

  printf("\nEnter string: ");

strtok_r is used to split the string on any of the delimiters .,?! and each resulting token (sentence) is then passed as the string argument to regexec, which checks whether the previously compiled regular expression matches anywhere in the token:

  if (fgets(para, 1000, stdin)) {

    char *ptr = para;
    sentence = strtok_r(ptr, delim, &ptr1);

    while(sentence != NULL){

      printf("\n%s", sentence);

      if (regexec(&CompiledRegExp,sentence,(size_t)0,NULL,0) == 0) {
        printf("\nYes");
      } else {
        printf("\nNo");
      }
      ptr = ptr1;
      sentence = strtok_r(ptr, delim, &ptr1);

    }
  }
  regfree(&CompiledRegExp);
}

It's probably a silly mistake on my part, but any help in locating the cause of the segfault would be greatly appreciated!

EDIT: Moved regfree to a more suitable location. However, the segfault still occurs. I'm pretty sure it has something to do with either how the regular expression is read in or how it is compared in regexec. I'm clueless, though.

Instead of this:

char *para;
fgets(para, 1000, stdin);

Write this:

char para[1000];
fgets(para, 1000, stdin);

In the first variant, para is an uninitialized pointer, so it points to an arbitrary location in memory, and fgets writes the user-entered string there. Most likely that address is invalid, which crashes your program immediately. In the second variant, para is an array with 1000 bytes of its own storage, so fgets has a valid place to write.
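As a minimal sketch of the array-based fix, the helper below reads one line into a caller-supplied buffer. The names read_line and PARA_SIZE are made up for illustration and do not appear in the question. It also strips the trailing newline, which fgets keeps in the buffer; that is an extra step, not something the original code does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PARA_SIZE 1000  /* hypothetical name for the buffer size used in the question */

/* Reads one line from `in` into `buf` (capacity `size` bytes) and removes
 * the trailing newline that fgets leaves in place.
 * Returns 1 on success, 0 on end-of-file or read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    if (fgets(buf, (int)size, in) == NULL)
        return 0;

    /* strcspn gives the index of the first '\n', or of the terminator
     * if the line had no newline, so this write is always in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

In main this would be used as char para[PARA_SIZE]; followed by if (read_line(stdin, para, sizeof para)) { ... }, so the storage lives in the array rather than behind an uninitialized pointer. The same newline stripping is probably worth applying to RegExp, since a newline left in the pattern becomes part of the regular expression.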

You called regfree inside the loop. On the second iteration, regexec is then called on a pattern whose memory has already been freed, which is undefined behavior.
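One way to picture the intended lifetime is: compile once, call regexec as many times as needed, and call regfree once after the last match attempt. The sketch below assumes the sentences are already available as an array of strings; the function name count_matches is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Counts how many of the `n` strings in `sentences` match `pattern`
 * (POSIX extended syntax). Returns -1 if the pattern does not compile. */
int count_matches(const char *pattern, const char *const *sentences, size_t n)
{
    regex_t re;
    int hits = 0;

    assert(pattern != NULL && sentences != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;  /* compilation failed, so `re` is not usable */

    for (size_t i = 0; i < n; i++) {
        /* `re` has to stay valid for every iteration: no regfree in here. */
        if (regexec(&re, sentences[i], 0, NULL, 0) == 0)
            hits++;
    }

    regfree(&re);   /* release the pattern once, after the last regexec */
    return hits;
}
```

The compiled pattern is owned by the function for exactly as long as the loop needs it, which makes it hard to free it too early by accident.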

You are using strtok_r() incorrectly.

To parse a string with strtok_r(), the first argument in the first call is a pointer to the string you want parsed. Subsequent calls to strtok_r() that continue parsing the same string should pass NULL as the first argument; the function keeps track of its position through the third argument. What you're doing:

ptr = ptr1;  
sentence = strtok_r(ptr, delim, &ptr1); 

makes no sense. Pass NULL as the first argument there and leave ptr1 for strtok_r to manage.
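The calling pattern described above can be sketched as follows. The function name split_sentences and its output-array interface are assumptions made for this example; only the strtok_r calls themselves reflect the documented usage.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits `text` in place on any character in `delim` and stores up to
 * `max` token pointers in `out`. Returns the number of tokens stored.
 * `text` is modified: each delimiter that ends a token becomes '\0'. */
size_t split_sentences(char *text, const char *delim, char **out, size_t max)
{
    char *save = NULL;  /* strtok_r keeps its position here between calls */
    size_t n = 0;

    assert(text != NULL && delim != NULL && out != NULL);

    /* First call: pass the string. Every later call: pass NULL. */
    for (char *tok = strtok_r(text, delim, &save);
         tok != NULL && n < max;
         tok = strtok_r(NULL, delim, &save)) {
        out[n++] = tok;
    }
    return n;
}
```

Splitting "Hello world. How are you?" on ".!?," this way gives the tokens "Hello world" and " How are you". The leading space of the second token is kept, because a space is not one of the delimiters.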
