
Splitting a string up using different delimiters in C

I am trying to split up a string using different delimiters. After hours of trial and error with strtok(), I have finally found a way to make it work. However, it passes NULL in place of the string argument to strtok, and I don't fully understand how it works.

I have tried to split it up so that I can save the pieces in separate variables and use them in other functions called from my main function, but it doesn't work, which leads me to believe this is an incredibly flimsy way of splitting the string up.

The input string is read from a config file and is in this format:

(6,2) SLUG 1 0 EAST

The current code I'm using is this:

void createSlug(char* data) {
        int slugPosX, slugPosY, slugAge;
        char *slugDir;
        char *token1;
        char *token2;

        slugPosX = atoi(strtok(data, "("));

        token1 = strtok(data, ",");
        slugPosY = atoi(strtok(strtok(NULL, ","), ")"));

        token2 = strtok(strtok(NULL, ","), ")");
        slugAge = atoi(strtok(token2, " SLUG "));

        slugDir = strtok(NULL, " 0 ");

        printf("slug position is: (%d,%d), with age %d, and direction: %s", slugPosX, slugPosY, slugAge, slugDir);

}

The output is printed as:

slug position is: (6,2), with age 1, and direction: EAST

The input file changes but is always in the above format. It is also worth mentioning that the '0' in the input string is always 0, so I ignored that part, as I could use it as a delimiter.

Is there an easier way of doing this? I'm very new to C, so any help would be greatly appreciated.

The input file changes but is always in the above format

Scanf?

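#include <stdio.h>

int main(void) {
    int w, x, y, z;
    char a[60] = "", b[60] = "";
    // sscanf returns the number of converted fields; %59s keeps each word inside its 60-byte buffer
    printf("%d\n", sscanf("(6,2) SLUG 1 0 EAST", "(%d,%d) %59s %d%d%59s", &w, &x, a, &y, &z, b));
    printf("w = %d, x = %d, y = %d, z = %d, a = '%s', b = '%s'\n", w, x, y, z, a, b);
}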

https://godbolt.org/z/Woa94sb1Y
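A minimal sketch of how the same idea could be wrapped into a function that also validates the record. The names parse_slug and struct slug are hypothetical and not from the question. It assumes the direction word is at most 15 characters, and it puts the fixed words SLUG and 0 directly into the format string so that sscanf has to match them literally.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type for illustration; the question only prints the fields. */
struct slug {
    int  x, y, age;
    char dir[16];
};

/* Returns 1 on success, 0 if the line does not match "(x,y) SLUG age 0 DIR". */
int parse_slug(const char *line, struct slug *out)
{
    int end = 0;

    assert(line != NULL && out != NULL);

    /* Literal text in the format must match the input; %15s bounds the
       write into dir; %n records how many characters were consumed. */
    int fields = sscanf(line, " (%d ,%d ) SLUG %d 0 %15s %n",
                        &out->x, &out->y, &out->age, out->dir, &end);

    /* Four conversions are expected (%n is not counted); reject trailing junk. */
    return fields == 4 && line[end] == '\0';
}
```

Calling parse_slug("(6,2) SLUG 1 0 EAST", &s) fills s.x, s.y, s.age and s.dir, while a line with a missing field, a different keyword, or extra text after the direction is rejected. A space in the format matches any amount of whitespace, including none, so this sketch is lenient about spacing.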

It's a poor craftsman who blames his tools.

To use strtok() it is probably easiest to only invoke it in one place. The function should "know" the record layout and can be written to "segment" the string to suit.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void createSlug( char* data ) {
    enum { ePosx, ePosy, eName, eAge, eXXX, eDir };

    // ALWAYS initialise variables to avoid sporadic functioning and possible UB
    int slugPosX = 0, slugPosY = 0, slugAge = 0;
    char *slugDir = "";

    int stage = 0;
    for( char *cp = data; ( cp = strtok( cp, "(,) \n" ) ) != NULL; cp = NULL )
        switch( stage++ ) {
            case ePosx: slugPosX = atoi( cp ); break;
            case ePosy: slugPosY = atoi( cp ); break;
            case eName: break;
            case eAge:  slugAge = atoi( cp ); break;
            case eXXX:  break;
            case eDir:  slugDir = cp; break;
            default:
                puts( "Extra fields!" );
                break;
        }

    printf("slug position is: (%d,%d), with age %d, and direction: %s", slugPosX, slugPosY, slugAge, slugDir);

}

int main( void ) {
    char str[] = "(6,2) SLUG 1 0 EAST\n";

    createSlug( str );

    return 0;
}

Output:

slug position is: (6,2), with age 1, and direction: EAST

Still using atoi() here, but strtol() may be a better conversion function.
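As a rough illustration of what strtol() offers over atoi(), here is a sketch of a conversion helper. The name token_to_int is hypothetical. It rejects tokens that contain no digits, tokens with trailing characters, and values that do not fit in an int, whereas atoi() gives no way to tell "0" apart from non-numeric text.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: converts a whole token to int and reports failure
   instead of silently returning 0 as atoi() does for non-numeric text. */
bool token_to_int(const char *token, int *result)
{
    char *end;
    long value;

    assert(token != NULL && result != NULL);

    errno = 0;
    value = strtol(token, &end, 10);

    if (end == token || *end != '\0')   /* no digits, or trailing junk */
        return false;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return false;                   /* does not fit in an int */

    *result = (int)value;
    return true;
}
```

In the switch above, a call such as token_to_int( cp, &slugPosX ) could stand in for atoi( cp ), with a false return treated as a malformed record.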

If you are sure that the characters '(', ',', ')' and ' ' are only used as delimiters and never occur in the tokens, then you can simply use "(,) " as the delimiter string in all calls to strtok:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void createSlug( char* data )
{
    int slugPosX, slugPosY, slugAge;
    char *slugDir;
    const char *delim = "(,) ";

    slugPosX = atoi( strtok(data, delim) );
    slugPosY = atoi( strtok(NULL, delim) );

    //ignore the "SLUG" token
    strtok( NULL, delim );

    slugAge = atoi( strtok(NULL, delim) );

    //ignore the "0" token
    strtok( NULL, delim );

    slugDir = strtok( NULL, delim );

    printf( "slug position is: (%d,%d), with age %d, and direction: %s", slugPosX, slugPosY, slugAge, slugDir );
}

int main( void )
{
    char str[] = "(6,2) SLUG 1 0 EAST";

    createSlug( str );
}

However, this program may crash if strtok ever returns NULL due to the input not being in the expected format. Here is a different version which does a lot more input validation and prints an error message instead of crashing:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void createSlug( char* data )
{
    int slugPosX, slugPosY, slugAge;
    char *slugDir;
    char *token, *p;
    const char *delim = "(,) ";

    //attempt to find first token
    token = strtok( data, delim );
    if ( token == NULL )
    {
        fprintf( stderr, "Unable to find first token!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to convert first token to an integer
    slugPosX = strtol( token, &p, 10 );
    if ( *p != '\0' )
    {
        fprintf( stderr, "Unable to convert first token to an integer!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to find second token
    token = strtok( NULL, delim );
    if ( token == NULL )
    {
        fprintf( stderr, "Unable to find second token!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to convert second token to an integer
    slugPosY = strtol( token, &p, 10 );
    if ( *p != '\0' )
    {
        fprintf( stderr, "Unable to convert second token to an integer!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to find third token
    token = strtok( NULL, delim );
    if ( token == NULL )
    {
        fprintf( stderr, "Unable to find third token!\n" );
        exit( EXIT_FAILURE );
    }

    //verify that third token contains "SLUG"
    if ( strcmp( token, "SLUG" ) != 0 )
    {
        fprintf( stderr, "Invalid content of third token!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to find fourth token
    token = strtok( NULL, delim );
    if ( token == NULL )
    {
        fprintf( stderr, "Unable to find fourth token!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to convert fourth token to an integer
    slugAge = strtol( token, &p, 10 );
    if ( *p != '\0' )
    {
        fprintf( stderr, "Unable to convert fourth token to an integer!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to find fifth token
    token = strtok( NULL, delim );
    if ( token == NULL )
    {
        fprintf( stderr, "Unable to find fifth token!\n" );
        exit( EXIT_FAILURE );
    }

    //verify that fifth token contains "0"
    if ( strcmp( token, "0" ) != 0 )
    {
        fprintf( stderr, "Invalid content of fifth token!\n" );
        exit( EXIT_FAILURE );
    }

    //attempt to find sixth token
    slugDir = strtok( NULL, delim );
    if ( slugDir == NULL )
    {
        fprintf( stderr, "Unable to find sixth token!\n" );
        exit( EXIT_FAILURE );
    }

    printf( "slug position is: (%d,%d), with age %d, and direction: %s", slugPosX, slugPosY, slugAge, slugDir );
}

int main( void )
{
    char str[] = "(6,2) SLUG 1 0 EAST";

    createSlug( str );
}

However, a significant amount of the code is now duplicated, so the code is not very maintainable.

Therefore, it may be better to use a more systematic approach, for example by

  • calling strtok in a loop, and
  • creating a function for converting tokens to integers, and calling this function multiple times.

Here is an alternative solution which does this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

#define NUM_TOKENS 6

// This function will return true if it was able to convert
// the token to an int, otherwise it will return false.
bool convert_token_to_int( char *str, int *result )
{
    long num;
    char *p;

    num = strtol( str, &p, 10 );

    if ( p == str || *p != '\0' )
    {
        return false;
    }

    *result = num;
    return true;
}

void createSlug( char* data )
{
    char *tokens[NUM_TOKENS];
    int slugPosX, slugPosY, slugAge;
    char *slugDir;
    const char *delim = "(,) ";

    //store all tokens in the array tokens
    tokens[0] = strtok( data, delim );
    for ( int i = 0; ; )
    {
        if ( tokens[i] == NULL )
        {
            fprintf( stderr, "Could not find token #%d!\n", i );
            return;
        }

        //break out of loop after finishing all tokens
        if ( ++i == NUM_TOKENS )
            break;

        //find next token for next loop iteration
        tokens[i] = strtok( NULL, delim );
    }

    //convert the integer tokens
    if (
        ! convert_token_to_int( tokens[0], &slugPosX )
        ||
        ! convert_token_to_int( tokens[1], &slugPosY )
        ||
        ! convert_token_to_int( tokens[3], &slugAge )
    )
    {
        fprintf( stderr, "Error converting tokens to integers!\n" );
        exit( EXIT_FAILURE );
    }

    //verify that non-variable tokens contain the
    //intended values
    if (
        strcmp( tokens[2], "SLUG" ) != 0
        ||
        strcmp( tokens[4], "0" ) != 0
    )
    {
        fprintf( stderr, "Non-variable tokens do not contain the intended values!\n" );
        exit( EXIT_FAILURE );
    }

    //make slugDir point to the appropriate token
    slugDir = tokens[5];

    printf( "slug position is: (%d,%d), with age %d, and direction: %s", slugPosX, slugPosY, slugAge, slugDir );
}

int main( void )
{
    char str[] = "(6,2) SLUG 1 0 EAST";

    createSlug( str );
}
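On the NULL argument that the question asks about: strtok keeps an internal pointer to where it stopped, and passing NULL tells it to continue from there in the same buffer. The sketch below isolates just that mechanism. The helper name split_tokens is hypothetical, and the buffer must be writable because strtok overwrites delimiters with '\0'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buf in place and stores up to max token
   pointers in out. Returns the number of tokens stored. */
size_t split_tokens(char *buf, const char *delim, char *out[], size_t max)
{
    size_t count = 0;

    assert(buf != NULL && delim != NULL && out != NULL);

    /* First call: pass the buffer. strtok skips leading delimiters,
       overwrites the delimiter that ends the token with '\0', and
       remembers where it stopped. */
    char *tok = strtok(buf, delim);

    while (tok != NULL && count < max) {
        out[count++] = tok;
        /* Later calls: NULL means "continue in the same buffer from
           the remembered position". */
        tok = strtok(NULL, delim);
    }
    return count;
}
```

With char str[] = "(6,2) SLUG 1 0 EAST" and the delimiter string "(,) ", this stores six pointers into str, and str itself afterwards reads as "(6" because the first comma has been replaced by a terminator. Passing a non-NULL pointer starts a fresh scan and discards the remembered position, which is why nested calls such as strtok(strtok(NULL, ","), ")") in the question are hard to reason about.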
