
How to find the number of occurrences of each unique word?

I can't figure out the bug in my code. Every time I input a sentence, the count does increment, but each word picks up the first letter of the previous word, and one more leftover letter with every word after that. How do I fix this?

void numberOfWordOccurrences(char str[MAX_CHAR]) {
  int count = 0, i = 0, j = 0;
  char uniqueToken[99][999];
  int tokenCount[99] = {0}; 
  while(str[i] != '\0') {
    char token[999];  
    while(str[i] != ' ' && str[i] != '\0') {
      token[j++] = str[i++];
    }
    if(token[j - 1] == ':' || token[j - 1] == ',' || token[j - 1] == '.' || token[j - 1] == ';' || token[j - 1] == '?' || token[j - 1] == '!') {
      token[j - 1] = '\0';
    }
    //null 
    token[j] = '\0';
    //flag 
    int flag = -1; 
    for(j = 0; j < count; j++) {
      if(strcmp(uniqueToken[j], token) == 0) {
        //if flag is valid, then...
        flag = j;
        tokenCount[flag] = token[flag] + 1;
        break;
      }
    }
    if(flag <= 1) {
      tokenCount[count] = tokenCount[count] + 1;
      strcpy(uniqueToken[count++], token);
    }
    i++;
  }
}

First, you have to set j = 0 at the top of your outer while loop. The search loop for(j = 0; j < count; j++) leaves j at a nonzero value, so on the next word token[j++] = str[i++]; does not start copying at index 0. The slots before that position still hold letters from the previous word, which is why they show up in the new one.
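
One way to make the reset impossible to forget is to keep the write index local to a small helper. This is only a sketch: next_token, pos and cap are names I made up, not part of the original code, and cap is assumed to be at least 1.

```c
#include <stddef.h>

/* Hypothetical helper (not in the original code): copies the word that
   starts at str[*pos] into token and returns its length. The write index j
   is a local variable, so it starts at 0 for every word. */
size_t next_token(const char *str, size_t *pos, char *token, size_t cap)
{
    size_t j = 0;                          /* reset for every word */

    while (str[*pos] == ' ')               /* skip leading separators */
        (*pos)++;

    while (str[*pos] != ' ' && str[*pos] != '\0') {
        if (j + 1 < cap)                   /* keep room for the terminator */
            token[j++] = str[*pos];
        (*pos)++;
    }
    token[j] = '\0';
    return j;                              /* 0 means no more words */
}
```

Calling it repeatedly with the same pos walks through the sentence one word at a time, and a return value of 0 signals the end.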

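Second, I believe the condition if(flag <= 1) should be if(flag == -1). flag holds the index of the matching entry, so a match at index 0 or 1 also satisfies flag <= 1. For example, if the first and fifth words are the same, flag is 0 when the fifth word is processed, and that string is copied into uniqueToken again.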
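
The same idea as a standalone lookup, where -1 is the only value that means "not found". Again just a sketch: find_word and the fixed width of 32 are assumptions for illustration.

```c
#include <string.h>

/* Hypothetical helper (not in the original code): returns the index of word
   in seen[0..count-1], or -1 when it is not there. */
int find_word(const char seen[][32], int count, const char *word)
{
    for (int k = 0; k < count; k++) {
        if (strcmp(seen[k], word) == 0)
            return k;          /* a valid index, which may well be 0 or 1 */
    }
    return -1;                 /* the only "not found" value */
}
```

The caller then tests the result against -1 exactly, because every value from 0 upward is a real position.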

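Also pay attention to the end of the string. When the inner loop stops at \0, the i++ at the bottom of the outer loop steps past it, so while(str[i] != '\0') never sees the terminator and keeps reading beyond the string. I suggested while(str[i-1] != '\0'), but note that this reads str[-1] on the first pass, while i is still 0. A safer form keeps while(str[i] != '\0') and makes sure i is never advanced past the terminator. If you do use the str[i-1] form, check before calling that the string is not empty (str[0] == '\0').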
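
To show the loop shape in isolation, here is a small word counter that only moves i while it is looking at a real character. It is a sketch under my own naming (count_words is hypothetical) and it treats a single space as the only separator, like the original code.

```c
#include <stddef.h>

/* Hypothetical helper (not in the original code): counts space-separated
   words without ever stepping past the terminating '\0'. */
size_t count_words(const char *str)
{
    size_t i = 0, words = 0;

    while (str[i] != '\0') {
        if (str[i] == ' ') {       /* separator: safe to advance */
            i++;
            continue;
        }
        words++;
        /* Walk to the end of the word. i stops ON the ' ' or '\0',
           so the outer test always gets to see the terminator. */
        while (str[i] != ' ' && str[i] != '\0')
            i++;
    }
    return words;
}
```

An empty string needs no special case here: the outer test fails immediately and the function returns 0.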

Putting it together:

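#include <string.h>

void numberOfWordOccurrences(char str[]) {
    int count = 0, i = 0, j = 0;
    char uniqueToken[99][999];
    int tokenCount[99] = { 0 };
    // Test str[i] itself: str[i - 1] would read str[-1] while i is still 0.
    while (str[i] != '\0') {
        if (str[i] == ' ') {
            // Skip separators here rather than with an unconditional i++ at
            // the bottom of the loop, which could step past the terminator.
            i++;
            continue;
        }
        j = 0; // start every word at the beginning of token
        char token[999];
        // j < 998 keeps room for the terminator; longer words get split
        while (str[i] != ' ' && str[i] != '\0' && j < 998) {
            token[j++] = str[i++];
        }
        // drop one trailing punctuation mark, if there is one
        if (j > 0 && (token[j - 1] == ':' || token[j - 1] == ',' || token[j - 1] == '.' || token[j - 1] == ';' || token[j - 1] == '?' || token[j - 1] == '!')) {
            j--;
        }
        //null
        token[j] = '\0';
        // flag stays -1 unless the word was seen before
        int flag = -1;
        for (j = 0; j < count; j++) {
            if (strcmp(uniqueToken[j], token) == 0) {
                flag = j;
                // bump the stored count; token[flag] + 1 would store a character value
                tokenCount[flag]++;
                break;
            }
        }
        // new word: store it, unless the table is already full
        if (flag == -1 && count < 99) {
            tokenCount[count] = 1;
            strcpy(uniqueToken[count++], token);
        }
    }
}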
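
The function as posted keeps its results in local arrays, so nothing reaches the caller. If you want the counts back, one option is to fill an array that the caller owns. This is a sketch, not the asker's design: struct WordCount, count_word_occurrences, MAX_WORDS and MAX_LEN are all names and limits I picked for illustration.

```c
#include <string.h>

#define MAX_WORDS 99   /* assumed table size, mirrors the 99 in the question */
#define MAX_LEN   64   /* assumed maximum word length, including '\0' */

/* Hypothetical result record; not part of the original code. */
struct WordCount {
    char word[MAX_LEN];
    int  count;
};

/* Returns nonzero for the punctuation marks the original code strips. */
static int is_trailing_punct(char c)
{
    return c != '\0' && strchr(":,.;?!", c) != NULL;
}

/* Fills out[] with each distinct word in str and how often it occurs.
   Returns the number of distinct words stored (at most MAX_WORDS). */
int count_word_occurrences(const char *str, struct WordCount out[MAX_WORDS])
{
    int unique = 0;
    size_t i = 0;

    while (str[i] != '\0') {
        if (str[i] == ' ') {              /* skip separators */
            i++;
            continue;
        }

        char token[MAX_LEN];
        size_t j = 0;                     /* fresh index for every word */
        while (str[i] != ' ' && str[i] != '\0') {
            if (j + 1 < MAX_LEN)          /* overlong words are truncated */
                token[j++] = str[i];
            i++;
        }
        if (j > 0 && is_trailing_punct(token[j - 1]))
            j--;                          /* drop one trailing mark */
        token[j] = '\0';
        if (j == 0)
            continue;                     /* the "word" was only punctuation */

        int found = -1;                   /* -1 means not seen yet */
        for (int k = 0; k < unique; k++) {
            if (strcmp(out[k].word, token) == 0) {
                found = k;
                break;
            }
        }
        if (found >= 0) {
            out[found].count++;
        } else if (unique < MAX_WORDS) {
            strcpy(out[unique].word, token);
            out[unique].count = 1;
            unique++;
        }
    }
    return unique;
}
```

The caller declares struct WordCount out[MAX_WORDS], passes it in, and then loops over the first n entries (n being the return value) to print out[k].word and out[k].count. The comparison is case-sensitive, so "The" and "the" are counted as different words.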
