
C - How to get the string of the outer token in a nested strtok_s

text block:

00000001,otherPerson,0134333334,anotherDepartment
00000002,anotherPerson,01287665478,newDepartment
00000003,someoneElse,0139487632,otherDepartment
00000004,wholeNewPerson,01786666317,aDeparment
00000005,aPerson,013293842,otherDepartment
00000006,oldPerson,0133937333,anotherDepartment

I am trying to process a block of text by checking whether a column in a row equals a given value and, if it does, retrieving the complete row. I split the block into rows on \n and then split each row into columns on , . But inside the inner loop, the outer token is no longer the complete row. How can I keep the outer token intact?

char *sav1 = NULL;
char *token = strtok_s(copyOfRecords, "\n", &sav1);
int counter = 0;

while (token != NULL) {
    char *sav2 = NULL;
    char *innerToken = strtok_s(token, ",", &sav2);
    int counter = 1;

    while (innerToken != NULL) {
        // the variable, "token" is not complete anymore in this block
        // How to keep the outer token complete?

        innerToken = strtok_s(NULL, ",", &sav2);
    }

    token = strtok_s(NULL, "\n", &sav1);
}

The strtok family of functions modifies the string it appears to extract tokens from: each delimiter that ends a token is overwritten with a null byte. This is confusing and often counter-productive, as you have experienced.
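To make the in-place modification concrete, here is a minimal sketch using the standard strtok() rather than strtok_s(). The function name length_after_tokenising is made up for this illustration.

```c
#include <string.h>

/* Hypothetical helper: tokenise row in place on commas, then report how
   long row is when read back as an ordinary C string.
   row must be a writable array, not a string literal. */
size_t length_after_tokenising(char *row) {
    for (char *tok = strtok(row, ","); tok != NULL; tok = strtok(NULL, ",")) {
        /* nothing to do: each call has already overwritten the comma
           that ended tok with '\0' */
    }
    return strlen(row);
}
```

Called on a writable copy of the first sample row, this returns 8, the length of "00000001", because the first comma is now a string terminator. The same thing happens to the outer token once the inner loop starts splitting it.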

Further confusion comes from the semantics of the tokenisation itself, which are also often misunderstood: for example, strtok(token, ",") treats any number of consecutive commas as a single separator, so it cannot handle empty comma-separated fields. strtok_s(), which in this three-argument form is a Microsoft extension not always available on non-Microsoft systems, behaves the same way. Consider not using these functions at all.
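The empty-field problem can be seen by counting fields both ways. This is only a sketch; count_tokens_strtok and count_fields_strcspn are hypothetical names, and the input "a,,c" is an invented example rather than part of the question's data.

```c
#include <string.h>

/* Count fields the way strtok() sees them: a run of commas is treated
   as one separator, so empty fields disappear. buf is modified. */
size_t count_tokens_strtok(char *buf) {
    size_t n = 0;
    for (char *tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        n++;
    }
    return n;
}

/* Count fields with strcspn(): every comma ends exactly one field,
   so an empty field between two commas is still counted. */
size_t count_fields_strcspn(const char *s) {
    size_t n = 0;
    for (;;) {
        s += strcspn(s, ",");   /* advance to the next comma or the end */
        n++;
        if (*s != ',') {
            break;              /* reached the end of the string */
        }
        s++;                    /* step over the comma */
    }
    return n;
}
```

For "a,,c" the strtok() version reports 2 fields while the strcspn() version reports 3, so a column index computed from strtok() tokens would be off by one after any empty field.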

You should instead use strcspn() to skip lines and columns and reach your target cell, test its size and contents and return a pointer to the row if there is a match or NULL if there is no match. This way you can restart a search from the last match.

#include <stdio.h>
#include <string.h>

char *select_row(const char *data, int col, const char *value) {
    size_t value_len = strlen(value);
    while (*data != '\0') {
        const char *p = data;
        size_t row_len = strcspn(p, "\n"); // length of the current row, excluding the newline
        for (int i = 0; i < col; i++) {
            p += strcspn(p, ",\n");  // skip the cell contents
            if (*p == ',') {
                p++;  // skip the comma to point to the next cell
            }
        }
        // p points to the column data
        size_t cell_len = strcspn(p, ",\n");  // compute the cell contents' length
        if (cell_len == value_len && memcmp(p, value, value_len) == 0) {
            // if there is a match, return a pointer to the beginning of the row.
            // beware that this is not a token as the data was not modified so
            // the row data stops at the newline but the string goes to the end of
            // the database.
            // you can return an allocated copy of the row with
            // return strndup(data, row_len);
            return (char *)data;
        }
        data += row_len;  // skip the row contents
        if (*data == '\n') {
            data++;   // skip the newline to point to the next row.
        }
    }
    return NULL;
}

int main() {
    const char *data = "00000001,otherPerson,0134333334,anotherDepartment\n"
                       "00000002,anotherPerson,01287665478,newDepartment\n"
                       "00000003,someoneElse,0139487632,otherDepartment\n"
                       "00000004,wholeNewPerson,01786666317,aDeparment\n"
                       "00000005,aPerson,013293842,otherDepartment\n"
                       "00000006,oldPerson,0133937333,anotherDepartment\n";

    const char *found = select_row(data, 1, "anotherPerson");
    if (found) {
        // measure the row only after checking that a match was found
        int length = (int)strcspn(found, "\n");
        printf("%.*s\n", length, found);
    }
    return 0;
}
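The remark about restarting a search from the last match could look something like the sketch below. It repeats select_row() in condensed form so that the block compiles on its own. count_matching_rows is a hypothetical helper added for illustration, not part of the answer.

```c
#include <string.h>

/* Condensed copy of select_row(): return a pointer to the first row whose
   column col equals value, or NULL if no row matches. */
char *select_row(const char *data, int col, const char *value) {
    size_t value_len = strlen(value);
    while (*data != '\0') {
        const char *p = data;
        size_t row_len = strcspn(data, "\n");
        for (int i = 0; i < col; i++) {
            p += strcspn(p, ",\n");      /* skip one cell */
            if (*p == ',') {
                p++;
            }
        }
        size_t cell_len = strcspn(p, ",\n");
        if (cell_len == value_len && memcmp(p, value, value_len) == 0) {
            return (char *)data;
        }
        data += row_len;                 /* move to the next row */
        if (*data == '\n') {
            data++;
        }
    }
    return NULL;
}

/* Hypothetical helper: count every matching row by restarting the search
   just past the end of each match. */
size_t count_matching_rows(const char *data, int col, const char *value) {
    size_t n = 0;
    const char *row = select_row(data, col, value);
    while (row != NULL) {
        n++;
        row += strcspn(row, "\n");       /* end of the matched row */
        if (*row == '\n') {
            row++;                       /* start of the following row */
        }
        row = select_row(row, col, value);
    }
    return n;
}
```

With the sample data from the question, searching column 3 for "otherDepartment" finds two rows (00000003 and 00000005). Because the data is never modified, the same buffer can be searched again for a different value afterwards.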

The simplest way to handle this is to make a copy of the outer token.

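// needs <string.h> for _strdup and strtok_s, and <stdlib.h> for free
while (token != NULL) {
    char *token_orig = _strdup(token);  // untouched copy of the whole row (NULL if allocation fails)
    char *sav2 = NULL;
    char *innerToken = strtok_s(token, ",", &sav2);
    int counter = 1;

    while (innerToken != NULL) {
        // Use token_orig here: it still holds the complete row

        innerToken = strtok_s(NULL, ",", &sav2);
    }

    token = strtok_s(NULL, "\n", &sav1);
    free(token_orig);
}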
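For readers whose compiler lacks strtok_s and _strdup, the same copy-first idea can be sketched in standard C. Note that this swaps in a different mechanism: the row boundary is found by the caller (for example with strcspn), the row is duplicated with malloc and memcpy, and only the duplicate is tokenised with plain strtok. The name row_field_equals is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: check whether column col of the row starting at
   row (row_len characters, newline excluded) equals value.
   Returns 1 on a match, 0 otherwise, -1 if the copy cannot be allocated.
   The original row text is never modified. */
int row_field_equals(const char *row, size_t row_len, int col, const char *value) {
    char *copy = malloc(row_len + 1);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, row, row_len);
    copy[row_len] = '\0';

    int match = 0;
    int i = 0;
    /* strtok() cuts up the copy only; like strtok_s, it skips empty fields */
    for (char *tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (i == col) {
            match = (strcmp(tok, value) == 0);
            break;
        }
        i++;
    }
    free(copy);
    return match;
}
```

Since the tokenising happens on a throwaway copy, the caller can still print or return the complete row from the original buffer after a match.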
