C-string alternatives to strtok_r() and strsep() that don't modify the original string pointer?

I was looking at two C-string functions, strtok_r() and strsep(), and noticed that both modify the location of the original string passed in.

Are there any other C-string functions that don't modify the original string passed in?

In my application, the original string is dynamically allocated, so I want to free it after the parsing is done.

An example with strtok_r():

#include <stdio.h>
#include <string.h>

int main(void){
    char *str = strdup("Tutorial and example");
    char *token;
    char *rest = str;

    printf("%s\n", rest);
    while ((token = strtok_r(rest, " ", &rest)))
        printf("%s\n", token);
    printf("\n%s\n", str);
    return 0;
}

Output

Tutorial and example
Tutorial
and
example

Tutorial

In the very last line, I want str to point to the unmodified C string "Tutorial and example".

A similar output would have occurred with strsep() as well.

#include <stdio.h>
#include <string.h>

int main(void){
    char *str = strdup("Tutorial and example");
    char *token;
    char *rest = str;

    printf("%s\n", rest);
    while ((token = strsep(&rest, " ")))
        printf("%s\n", token);
    if (rest != NULL)
        printf("%s\n", rest);

    printf("%s\n", str);
    return 0;
}

Thank you.

I think you are misunderstanding strtok_r. It does not change the location of the original string; in fact, it cannot. C passes arguments by value, so the function cannot change the value of the pointer passed into it and make that change visible to the calling code.

What it can and will do is modify the contents of the string itself, by overwriting the delimiter that follows each token with a nul terminator. So, to answer your original question:

In my application, the original string is dynamically allocated, so I wish to free the original string after the parsing is done.

You do not have to do anything special. You can and should free the original string once you are done with it.

You are seeing the single word Tutorial printed simply because the character after it was replaced with a nul terminator, and printf stops there. If you inspect the string character by character, you will see that it has otherwise remained intact.

Although the mentioned string functions change the contents of the original string, the pointer str still points to the dynamically allocated memory, and you can use it to free that memory.

If you do not want to change the original string, you can use the standard C string functions strspn and strcspn.

For example

#include <stdio.h>
#include <string.h>

int main(void) 
{
    const char *s = "Tutorial and example";
    const char *separator = " \t";

    puts( s );

    for ( const char *p = s; *p; )
    {
        /* skip any leading separators */
        p += strspn( p, separator );

        const char *prev = p;

        /* advance to the end of the current token */
        p += strcspn( p, separator );

        int width = ( int )( p - prev );

        /* print exactly width characters without modifying s */
        if ( width ) printf( "%.*s\n", width, prev );
    }

    return 0;
}

The program output is

Tutorial and example
Tutorial
and
example

Using this approach, you can dynamically allocate memory for each extracted substring.

For example

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void) 
{
    const char *s = "Tutorial and example";
    const char *separator = " \t";
    
    puts( s );
    
    size_t n = 0;
    char **a = NULL;
    int success = 1;
    
    for ( const char *p = s; success && *p; )
    {
        p += strspn( p, separator );
        
        const char *prev = p;
        
        p += strcspn( p, separator );
        
        if ( p - prev != 0 )
        {
            char *t = malloc( p - prev + 1 );
            
            if ( ( success = t != NULL ) )
            {
                t[p - prev] = '\0';
                memcpy( t, prev, p - prev );
            
                char **tmp = realloc( a, ( n + 1 ) * sizeof( char * ) );
                
                if ( ( success = tmp != NULL ) )
                {
                    a = tmp;
                    a[n++] = t;
                }
                else
                {
                    free( t );
                }
            }
        }
    }
    
    for ( size_t i = 0; i < n; i++)
    {
        puts( a[i] );
    }

    for ( size_t i = 0; i < n; i++)
    {
        free( a[i] );
    }
    
    free( a );
    
    return 0;
}

The program output is the same as shown above.

Tutorial and example
Tutorial
and
example
