C Programming find all the words in a random string
Say I have a very random string such as:
"%^&%thank*(^ ^&* you&*^^guys"
What is the most efficient way to find all the words in the string, without checking the string character by character?
By request, here is how I would have done it:
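#include <stdio.h>

/* Return the length of a NUL-terminated string (same result as strlen). */
int length(char *c) {
    int n = 0;
    while (*(c + n)) {
        n++;
    }
    return n;
}

int main(int argc, char *argv[]) {
    int n;
    int m;
    int count = 1;
    if (argc < 2) {
        printf("%s", "error");
        return 1;
    }
    while (argv[count] != NULL) {
        n = length(argv[count]);
        /* Visit every character of the current argument. */
        for (m = 0; m < n; m++) {
            /* A range test in C needs two comparisons joined by &&. */
            if ('a' <= argv[count][m] && argv[count][m] <= 'z') {
                //do stuff
            }
        }
        count++;
    }
    return 0;
}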
You can just use strtok(3) to parse the string at multiple delimiters. In terms of making this work for random strings, you might need to have a collection of all the possible delimiters that could occur. Here is a very basic example of using strtok():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char str[] = "%^&%thank*(^ ^&* you&*^^guys";
    const char *delim = "%^*(& ";

    char *word = strtok(str, delim);
    while (word != NULL) {
        printf("%s\n", word);
        word = strtok(NULL, delim);
    }
    return 0;
}
UPDATE:
Here is a more useful method, which collects the delimiters from str:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAXCHAR 256

int main(void) {
    char str[] = "%^&%thank*(^ ^&* you&*^^guys";
    int count[MAXCHAR] = {0};   /* count[c] is 1 once c is already in delim */
    char *word;
    unsigned char curr;
    size_t charcount = 0, numbytes = strlen(str);
    char delim[numbytes+1];

    /* Collect each distinct non-alphabetic character of str into delim. */
    for (size_t i = 0; str[i] != '\0'; i++) {
        curr = str[i];
        /* Pass the unsigned char value: isalpha() is undefined for negative char values. */
        if (!isalpha(curr) && count[curr] == 0) {
            delim[charcount++] = str[i];
            count[curr] = 1;
        }
    }
    delim[charcount] = '\0';

    word = strtok(str, delim);
    while (word != NULL) {
        printf("%s\n", word);
        word = strtok(NULL, delim);
    }
    return 0;
}
This solution uses an O(n) hashing approach, a lookup table indexed by character value, to add only unique delimiters. It is a possible solution, but going through the string character by character is more efficient: all you need is a temporary buffer to store the current word being processed, and once a non-alpha character is seen, you terminate the buffer and start again.