How to separate a string by spaces and save the separated words regardless?

I have a program that separates words by spaces. I need to count which words in the text contain 4 different vowels. It didn't seem complicated until I realized that I don't know how to save the words returned by the function (strtok) that splits the text, and of course I can't examine a single word because my loop only prints them. I don't know how to save each word individually so that I can count how many vowels it contains and then continue with the other words, one by one.

#include <stdio.h>
#include <string.h>

#define MAX 100

int main() {
    char phase[MAX];
    char temp [50];
    char delim[] = " ";
    
    printf("Ingrese un texto corto: ");
    gets(phase); //Short text with spaces
    printf("\n");
    
    for (i = 0; x < phase[x] != '\0'; ++x) {
        if (phase[x] == ' ' || phase[x] == '\t' || phase[x] == '\v') {
            //Detect space.
        }
    }
    
    char *ptr = strtok(phase, delim);
    
    while (ptr != NULL) {
        printf("%s\n", ptr); //I need to keep all the words separate.
        
        ptr = strtok(NULL, delim);
    }

    return 0;
}

Result:

Ingrese un texto corto: Ana come en el refrigerador.

Ana
come
en
el
refrigerador.

I believe this code will solve the task. It's tempting to use strtok() again to search for vowels, but that would overwrite the word-boundary information that strtok keeps in its internal state, so the outer loop over words would lose its place. So, instead, use strpbrk():

#include <stdio.h>
#include <string.h>

#define MAX 100

int main() {
    char text[MAX];

    // all isspace() characters for "C" locale
    char delimiters[] = " \t\n\v\f\r";

    char vowels[] = "aeiou";

    printf("Input a string of text: ");
    // replaced unsafe gets() with the safer fgets()
    if (fgets(text, sizeof(text), stdin) == NULL)
        return 1; // no input available

    char* word = strtok(text, delimiters);
    while (word != NULL) {
        // strpbrk() does roughly the same thing as strtok(), but it doesn't
        // modify a string nor does it remember anything on future calls
        char* ptr_to_vowel = word;
        int count = 0;
        while (NULL != (ptr_to_vowel = strpbrk(ptr_to_vowel, vowels))) {
            count++;

            // otherwise we'd stay on the same vowel
            ptr_to_vowel++;
        }
        printf("found %d vowels in %s\n", count, word);
        word = strtok(NULL, delimiters);
    }
    return 0;
}

A few issues:

  1. The for loop for counting spaces is incorrect.
  2. We should have a separate function to count vowels.
  3. Never use gets (use fgets instead).
  4. The code did not preserve the original buffer as the code comments suggested.

"I need to count which words in the text contain 4 different vowels."

So we should count only the unique vowels in a word, e.g.:

  1. fleece has only 1 unique vowel, not 3.
  2. great has 2 vowels.
  3. greet has 1 vowel.
  4. incombustible has 4 unique vowels, not 5.

It's not totally clear, but I interpret this to mean that a candidate word has at least 4 unique vowels (i.e. it could have 5).

I had to refactor quite a bit of the code. It is annotated:

#include <stdio.h>
#include <string.h>
#include <stddef.h>
#include <ctype.h>

#define MAX 100

// vcount -- count _unique_ vowels
// RETURNS: number of _unique_ vowels
int
vcount(const char *str)
{
    const char *vowels = "aeiou";
    int vfreq[5] = { 0 };
    int vsum = 0;

    // loop through all chars in string
    for (int chr = *str++;  chr != 0;  chr = *str++) {
        // get lower case
        chr = tolower((unsigned char) chr);

        // is it a vowel?
        const char *vptr = strchr(vowels,chr);
        if (vptr == NULL)
            continue;

        // get index into frequency table
        ptrdiff_t vidx = vptr - vowels;

        // have we seen it before?
        if (vfreq[vidx])
            continue;

        // mark as already seen
        vfreq[vidx] = 1;

        // count new unique vowel
        ++vsum;
    }

    return vsum;
}

int
main(void)
{
    char phrase[MAX];
    char temp[MAX];
    // '\n' is a delimiter too: fgets keeps the newline at the end of the line
    const char *delim = " \t\v\n";

    printf("Ingrese un texto corto: ");
    // Short text with spaces
// NOTE/BUG: _never_ use gets
#if 0
    gets(phrase);
#else
    fgets(phrase,sizeof(phrase),stdin);
#endif
    printf("\n");

// NOTE/BUG: loop condition is incorrect
#if 0
    for (i = 0; x < phrase[x] != '\0'; ++x) {
        if (phrase[x] == ' ' || phrase[x] == '\t' || phrase[x] == '\v') {
            // Detect space.
        }
    }
#else
    int space_count = 0;
    for (int i = 0; phrase[i] != '\0'; ++i) {
        switch (phrase[i]) {
        case ' ':
        case '\t':
        case '\v':
            ++space_count;
            break;
        }
    }
    printf("Spaces: %d\n",space_count);
#endif

    // I need to keep all the words separate.
#if 0
    char *ptr = strtok(phrase, delim);
#else
    strcpy(temp,phrase);
    char *ptr = strtok(temp, delim);
#endif

    while (ptr != NULL) {
#if 0
        printf("%s\n", ptr);
#else
        printf("%s -- has enough vowels: %s\n",
            ptr,(vcount(ptr) >= 4) ? "Yes" : "No");
#endif

        ptr = strtok(NULL, delim);
    }

    return 0;
}

In the above code, I used cpp conditionals to denote old vs. new code:

#if 0
// old code
#else
// new code
#endif

#if 1
// new code
#endif

For the input:

Ana come en el refrigerador.

Here is the program output:

Ingrese un texto corto:
Spaces: 4
Ana -- has enough vowels: No
come -- has enough vowels: No
en -- has enough vowels: No
el -- has enough vowels: No
refrigerador. -- has enough vowels: Yes

Instead of splitting the text into words and then counting the vowels inside each one, you can count vowels as you scan and, on reaching a delimiter, check the vowel count and reset it. This way, you do not need to modify the string. You do not even need an array to store the string: you can just read one byte at a time.

Here is a modified version:

#include <stdio.h>

int main() {
    int mask = 0;
    int vowels = 0;
    int matches = 0;
    int done = 0;
    
    printf("Ingrese un texto: ");
    while (!done) {
        switch (getchar()) {
          case EOF:
          case '\n':
            if (vowels >= 4)
                matches++;
            vowels = mask = 0;
            done = 1;
            break;
          case ' ':
          case '\t':
          case '\v':
          case '\f':
          case '\r':
            if (vowels >= 4)
                matches++;
            vowels = mask = 0;
            break;
          case 'a':
          case 'A':
            if (!(mask & 1)) {
                mask |= 1;
                vowels++;
            }
            break;
          case 'e':
          case 'E':
            if (!(mask & 2)) {
                mask |= 2;
                vowels++;
            }
            break;
          case 'i':
          case 'I':
            if (!(mask & 4)) {
                mask |= 4;
                vowels++;
            }
            break;
          case 'o':
          case 'O':
            if (!(mask & 8)) {
                mask |= 8;
                vowels++;
            }
            break;
          case 'u':
          case 'U':
            if (!(mask & 16)) {
                mask |= 16;
                vowels++;
            }
            break;
        }
    }
    printf("%d\n", matches);
    return 0;
}

Here is an alternative without a switch statement:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main() {
    int mask = 0;
    int vowels = 0;
    int matches = 0;
    int done = 0;
    const char *vowel_chars = "aAeEiIoOuU";

    printf("Ingrese un texto: ");
    while (!done) {
        int c = getchar();
        if (c == EOF || isspace(c)) {
            if (vowels >= 4)
                matches++;
            vowels = mask = 0;
            if (c == '\n' || c == EOF)
                done = 1;
        } else {
            // lower and upper case of a vowel sit side by side,
            // so index / 2 maps both to the same bit
            const char *p = memchr(vowel_chars, c, 10);
            if (p != NULL) {
                int bit = 1 << ((p - vowel_chars) / 2);
                if (!(mask & bit)) {
                    mask |= bit;
                    vowels++;
                }
            }
        }
    }
    printf("%d\n", matches);
    return 0;
}
