
Segmentation fault when using regexec/strtok_r in C

I'm having trouble figuring out where and why I'm getting a segmentation fault.

I'm writing a C program that prompts the user for a regular expression, compiles it, and then reads a string containing multiple sentences:

int main(void){

  char RegExp[50];
  regex_t CompiledRegExp;
  char *para;
  char delim[] = ".!?,";
  char *sentence;
  char *ptr1;

  printf("Enter regular expression: ");
  fgets(RegExp, 50, stdin);

  if (regcomp(&CompiledRegExp, RegExp, REG_EXTENDED | REG_NOSUB) != 0) {
    printf("ERROR: Something wrong in the regular expression\n");
    exit(EXIT_FAILURE);
  }

  printf("\nEnter string: ");

strtok_r splits the string on any of the delimiters .,?! and each resulting token (sentence) is then passed as the string argument to regexec, which checks whether the token matches the previously compiled regular expression:

if( fgets(para, 1000, stdin)){

    char *ptr = para;
    sentence = strtok_r(ptr, delim, &ptr1);

    while(sentence != NULL){

      printf("\n%s", sentence);

      if (regexec(&CompiledRegExp,sentence,(size_t)0,NULL,0) == 0) {
        printf("\nYes");
      } else {
        printf("\nNo");
      }
      ptr = ptr1;
      sentence = strtok_r(ptr, delim, &ptr1);

    }
  }
regfree(&CompiledRegExp);
}

It's probably a silly mistake on my part, but any help in locating the cause of the segfault would be greatly appreciated!

EDIT: Moved regfree to a more suitable location. However, the segfault still occurs. I'm pretty sure it has something to do with either how the regular expression is read in or how it is compared in regexec. I'm clueless, though.

Instead of this:

char *para;
fgets(para, 1000, stdin);

Write this:

char para[1000];
fgets(para, 1000, stdin);

In the first variant, para is an uninitialized pointer: it points to some arbitrary location in memory, and fgets writes the user-entered string there. Most likely that address is invalid, so the program crashes immediately.
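If para really needs to stay a pointer, it has to point at real storage before fgets writes through it. A minimal sketch, assuming heap allocation is acceptable; read_para and PARA_SIZE are hypothetical names, not part of the original code:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define PARA_SIZE 1000  /* assumption: same size the question passes to fgets */

/* Hypothetical helper: returns a heap buffer holding one line read from
   `in`, or NULL on allocation failure or end of input.
   The caller is responsible for calling free() on the result. */
char *read_para(FILE *in)
{
    assert(in != NULL);

    char *para = malloc(PARA_SIZE);   /* para now points at real storage */
    if (para == NULL)
        return NULL;

    if (fgets(para, PARA_SIZE, in) == NULL) {
        free(para);                   /* nothing was read; do not leak */
        return NULL;
    }
    return para;
}
```

In main this would be used as char *para = read_para(stdin); followed by a NULL check, with free(para) once the string is no longer needed. For a fixed size like this, the plain array is simpler.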

You called regfree inside the loop. On the second iteration, regexec then operates on freed memory, which is undefined behavior.
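The intended lifetime is: compile once, call regexec as many times as needed, then call regfree exactly once after the last match attempt. A rough sketch of that pattern; count_matches is a hypothetical helper and the array of strings stands in for the tokens produced in the question:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: counts how many of the n strings in `lines` match
   `pattern`. Returns -1 if the pattern does not compile. */
int count_matches(const char *pattern, const char *const *lines, size_t n)
{
    assert(pattern != NULL && lines != NULL);

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;                    /* nothing to free on failure */

    int hits = 0;
    for (size_t i = 0; i < n; i++) {
        /* re stays valid for every iteration of the loop */
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            hits++;
    }

    regfree(&re);                     /* freed once, after the last regexec */
    return hits;
}
```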

You are using strtok_r() incorrectly.

To parse a string with strtok_r(), pass a pointer to the string you want parsed as the first argument in the first call. Subsequent calls to strtok_r() that continue parsing the same string should pass NULL as the first argument and the same save pointer as the third. What you're doing:

ptr = ptr1;  
sentence = strtok_r(ptr, delim, &ptr1); 

makes no sense.
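The conventional calling pattern looks roughly like the sketch below: the string goes in on the first call only, and every later call passes NULL with the same save pointer. split_sentences and its out/max parameters are hypothetical, introduced only to make the example self-contained:

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `text` in place on any character in `delim`
   and stores up to `max` token pointers in `out`. Returns the token count.
   Note that strtok_r modifies `text`, so it must be writable. */
size_t split_sentences(char *text, const char *delim, char **out, size_t max)
{
    assert(text != NULL && delim != NULL && out != NULL);

    char *save = NULL;                /* state owned by strtok_r */
    size_t n = 0;

    /* First call: pass the string itself. */
    char *sentence = strtok_r(text, delim, &save);
    while (sentence != NULL && n < max) {
        out[n++] = sentence;
        /* Later calls: pass NULL and the same save pointer. */
        sentence = strtok_r(NULL, delim, &save);
    }
    return n;
}
```

Tokens keep any leading space that followed a delimiter, so "Hi there. How are you?" yields "Hi there" and " How are you".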
