
C Programming: find all the words in a random string

Say I have a very random string such as:

"%^&%thank*(^  ^&* you&*^^guys"

What is the most efficient way to find all the words in the string, without checking the string character by character?

Here is how I would have done this, added on request:

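#include <stdio.h>

/* Return the length of a NUL-terminated string (a hand-rolled strlen). */
int length(char *c) {
    int n = 0;

    while (*(c + n)) {
        n++;
    }
    return n;
}

int main(int argc, char *argv[]) {
    int n;
    int m;
    int count = 1;

    if (argc < 2) {
        printf("%s\n", "error");
        return 1; /* nothing to process */
    }

    while (argv[count] != NULL) {
        n = length(argv[count]);
        m = 0; /* restart at the first character of each argument */
        while (m != n) {
            char c = argv[count][m];
            /* 'a' < c < 'z' parses as ('a' < c) < 'z', which is always true */
            if ('a' <= c && c <= 'z') {
                //do stuff
            }
            m++; /* advance, otherwise the loop never terminates */
        }
        count++;
    }

    return 0;
}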

You can just use strtok(3) to parse the string at multiple delimiters. To make this work for random strings, you might need a collection of all the possible delimiters that could occur. Here is a very basic example of using strtok():

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char str[] = "%^&%thank*(^  ^&* you&*^^guys";
    const char *delim = "%^*(& ";

    char *word = strtok(str, delim);
    while (word != NULL) {
        printf("%s\n", word);
        word = strtok(NULL, delim);
    }

    return 0;
}

UPDATE:

Here is a more useful method, which collects the delimiters from str:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAXCHAR 256

int main(void) {
    char str[] = "%^&%thank*(^  ^&* you&*^^guys";
    int count[MAXCHAR] = {0};
    char *word;
    unsigned char curr;
    size_t charcount = 0, numbytes = strlen(str);
    char delim[numbytes+1];

    for (size_t i = 0; str[i] != '\0'; i++) {
        curr = str[i];
        /* pass the unsigned char value: isalpha() on a negative char is undefined */
        if (!isalpha(curr) && count[curr] == 0) {
            delim[charcount++] = str[i];
            count[curr] = 1;
        }
    }
    delim[charcount] = '\0';

    word = strtok(str, delim);
    while (word != NULL) {
        printf("%s\n", word);
        word = strtok(NULL, delim);
    }

    return 0;
}

This solution uses a lookup table indexed by character value (a simple form of hashing) so that each unique delimiter is added only once, in O(n) time. It is a possible solution, but going through the string character by character is more efficient: strtok() still has to examine every character internally, and collecting the delimiters costs an extra pass over the string. All the character-by-character approach needs is a temporary buffer to store the word currently being processed; once a non-alphabetic character is seen, terminate the buffer and start again.
