Implementation of strtok in C
char* my_strtok (char* s1, const char* s2){
    char *res = NULL;
    size_t i, j, len1 = mstrlen(s1), len2 = mstrlen(s2);
    for(i=0U; i< len1; i++) {
        for(j=0U; j<len2; j++) {
            if(s1[i] == s2[j]) {
                s1[i] = '\0'; res = (s1 + i+ 1);
                break;
            }
        }
    }
    return res;
}
Is this a correct implementation of strtok? If not, could you show your own implementation?
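Tracing the posted function on a small input shows how it differs from strtok. The sketch below assumes that mstrlen (not shown in the question) behaves like strlen, and it renames the function to posted_strtok, a hypothetical name, to keep it apart from the versions that follow.

```c
#include <stddef.h>
#include <string.h>

/* The function from the question, with strlen() standing in for mstrlen()
 * (assumption: mstrlen is an ordinary string-length function). */
char *posted_strtok(char *s1, const char *s2)
{
    char *res = NULL;
    size_t i, j, len1 = strlen(s1), len2 = strlen(s2);

    for (i = 0U; i < len1; i++) {
        for (j = 0U; j < len2; j++) {
            if (s1[i] == s2[j]) {
                s1[i] = '\0';        /* every delimiter gets overwritten ... */
                res = s1 + i + 1;    /* ... and res is moved each time */
                break;               /* leaves only the inner loop */
            }
        }
    }
    return res;
}
```

For the buffer "a,b;c" and the delimiters ",;", a single call overwrites both delimiters and returns a pointer to "c", the text after the last delimiter. strtok would return "a" and stop at the first delimiter. The posted function also cannot be continued with a NULL first argument, because nothing remembers where the previous call stopped.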
You need a place to keep the current position in the input between calls. Here is an example that uses strspn() and strcspn() to find the positions of the delimiters:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// SOME CHECKS OMITTED!

// helper for testing, not necessary for strtok()
static char *strduplicator(const char *s)
{
    char *dup;
    dup = malloc(strlen(s) + 1);
    if (dup != NULL) {
        strcpy(dup, s);
    }
    return dup;
}

// thread-safe (sort of) version
char *my_strtok(char *in, const char *delim, char **pos)
{
    char *token = NULL;
    // if the input is NULL, we assume that this
    // function has already run and use the saved position
    // at "pos" instead
    if (in == NULL) {
        in = *pos;
    }
    // skip leading delimiters that are left
    // over from the last run, if any
    in += strspn(in, delim);
    // if it is still not the end of the input
    if (*in != '\0') {
        // start of token is at the current position, set it
        token = in;
        // skip non-delimiters, that is: find end of token
        in += strcspn(in, delim);
        // cut off the token by setting the first delimiter to NUL
        // that is: set end of token
        if (*in != '\0') {
            *in = '\0';
            in++;
        }
    }
    // keep current position of input in "pos"
    *pos = in;
    return token;
}

int main(void)
{
    char *in_1 = strduplicator("this,is;the:test-for!strtok.");
    char *in_2 = strduplicator("this,is;the:test-for!my_strtok.");
    char *position, *token, *s_in1 = in_1, *s_in2 = in_2;
    const char *delimiters = ",;.:-!";

    token = strtok(in_1, delimiters);
    printf("BUILDIN: %s\n", token);
    for (;;) {
        token = strtok(NULL, delimiters);
        if (token == NULL) {
            break;
        }
        printf("BUILDIN: %s\n", token);
    }

    token = my_strtok(in_2, delimiters, &position);
    printf("OWNBUILD: %s\n", token);
    for (;;) {
        token = my_strtok(NULL, delimiters, &position);
        if (token == NULL) {
            break;
        }
        printf("OWNBUILD: %s\n", token);
    }

    free(s_in1);
    free(s_in2);
    exit(EXIT_SUCCESS);
}
If you want to have the ordinary
char *strtok(char *str, const char *delim);
you can do, e.g.:
static char *pos;
char *own_strtok(char *in, const char *delim)
{
    return my_strtok(in, delim, &pos);
}
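The difference between the two interfaces shows up as soon as two tokenizing loops are nested. The following is only a sketch: count_pairs is a hypothetical helper, the "key=value;key=value" input format is an assumption made for illustration, and my_strtok is repeated in condensed form so the block compiles on its own.

```c
#include <stddef.h>
#include <string.h>

/* Condensed copy of my_strtok() from above, so the sketch compiles alone. */
static char *my_strtok(char *in, const char *delim, char **pos)
{
    char *token = NULL;

    if (in == NULL) {
        in = *pos;
    }
    in += strspn(in, delim);
    if (*in != '\0') {
        token = in;
        in += strcspn(in, delim);
        if (*in != '\0') {
            *in++ = '\0';
        }
    }
    *pos = in;
    return token;
}

/* Hypothetical helper: counts the complete "key=value" fields in input
 * such as "a=1;b=2;c=3". The outer loop splits on ';', the inner calls
 * split on '='. Each level has its own position variable, so the two
 * do not interfere with each other. */
size_t count_pairs(char *text)
{
    char *outer_pos, *inner_pos;
    char *field, *key, *value;
    size_t pairs = 0;

    for (field = my_strtok(text, ";", &outer_pos);
         field != NULL;
         field = my_strtok(NULL, ";", &outer_pos)) {
        key = my_strtok(field, "=", &inner_pos);
        value = my_strtok(NULL, "=", &inner_pos);
        if (key != NULL && value != NULL) {
            pairs++;
        }
    }
    return pairs;
}
```

If both levels went through own_strtok instead, they would share the single static pos: after the inner calls, pos points at the end of the first field, so the outer loop would stop after one field.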
The functions str[c]spn() are quite simple. To quote the man page of strspn():
The strspn() function returns the number of bytes in the initial segment of s which consist only of bytes from accept.
size_t my_strspn(const char *s, const char *accept)
{
    const char *delim;
    size_t size = 0;
    // step through the input
    while (*s != '\0') {
        // step through delimiters and test
        for (delim = accept; *delim != '\0'; delim++) {
            if (*s == *delim) {
                break;
            }
        }
        // we are through all of the delimiters without success,
        // terminate
        if (*delim == '\0') {
            break;
        } else {
            size++;
        }
        s++;
    }
    return size;
}
The inverse function strcspn() is even simpler. To quote from the man page again:
The strcspn() function returns the number of bytes in the initial segment of s which are not in the string reject.
size_t my_strcspn(const char *s, const char *reject)
{
    const char *delim;
    size_t size = 0;
    // step through the input
    while (*s != '\0') {
        // step through delimiters and test
        for (delim = reject; *delim != '\0'; delim++) {
            if (*s == *delim) {
                return size;
            }
        }
        size++;
        s++;
    }
    return size;
}
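To make the interplay of the two functions concrete, here is a small sketch that locates one token without writing to the input. The name next_span is hypothetical, and the standard strspn() and strcspn() are used so the block compiles on its own.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next token in s without modifying s.
 * Returns a pointer to the token's first byte and stores its length in
 * *len, or returns NULL when only delimiters (or nothing) remain. */
const char *next_span(const char *s, const char *delim, size_t *len)
{
    s += strspn(s, delim);        /* skip the run of delimiters */
    if (*s == '\0') {
        return NULL;
    }
    *len = strcspn(s, delim);     /* length of the run of non-delimiters */
    return s;
}
```

For the input ",;this,is" and the delimiters ",;", strspn() skips two bytes and strcspn() then reports a token length of 4 ("this"). Calling the helper again with the returned pointer plus that length yields "is" with length 2.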
With n the size of the input and k the length of the delimiter string, the time complexity is O(kn). In theory k cannot exceed the size of the alphabet of the input, so we should be able to assume k << n. But that assumes that every delimiter occurs only once in the delimiter string, which is not always the case:
strtok(
    "This is a sentence without the last letter of the alphabet.",
    "zzz/* 1,000,000,000 other z's omitted */zzz"
);
So be careful with auto-generated delimiter sets and add an extra check if that danger is real (e.g. with user input).
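One possible safeguard, shown here only as a sketch, is to read the delimiter string once into a lookup table, so that duplicates cost time only while the table is filled. The name table_strcspn is hypothetical, and the approach is a swapped-in technique rather than part of the implementation above.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical variant of strcspn(): marks every delimiter byte in a
 * table once (k steps), then scans the input with one table lookup per
 * byte (at most n steps). */
size_t table_strcspn(const char *s, const char *reject)
{
    unsigned char is_delim[UCHAR_MAX + 1] = { 0 };
    size_t size = 0;

    /* pass over the delimiters: duplicates just set the same entry again */
    for (; *reject != '\0'; reject++) {
        is_delim[(unsigned char)*reject] = 1;
    }
    /* pass over the input: stop at the first byte marked as delimiter */
    while (s[size] != '\0' && !is_delim[(unsigned char)s[size]]) {
        size++;
    }
    return size;
}
```

This brings the cost down to O(n + k) per call, at the price of a small table on the stack (256 bytes when a byte has 8 bits). A very long delimiter string is still read in full on every call, so a length limit on user-supplied delimiters may still be worthwhile.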