
C - How to get the string of the outer token in a nested strtok_s

Text block:

00000001,otherPerson,0134333334,anotherDepartment
00000002,anotherPerson,01287665478,newDepartment
00000003,someoneElse,0139487632,otherDepartment
00000004,wholeNewPerson,01786666317,aDeparment
00000005,aPerson,013293842,otherDepartment
00000006,oldPerson,0133937333,anotherDepartment

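I am trying to process a block of text data by checking whether a column in a row equals a value, and then getting the complete row. I split the text block into rows on \n, then split each row into columns on a comma. But inside the inner iteration, the outer token is no longer the complete row. How can I keep the token intact?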

char *sav1 = NULL;
char *token = strtok_s(copyOfRecords, "\n", &sav1);
int counter = 0;

while (token != NULL) {
    char *sav2 = NULL;
    char *innerToken = strtok_s(token, ",", &sav2);
    int counter = 1;

    while (innerToken != NULL) {
        // the variable, "token" is not complete anymore in this block
        // How to keep the outer token complete?

        innerToken = strtok_s(NULL, ",", &sav2);
    }

    token = strtok_s(NULL, "\n", &sav1);
}

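The strtok family of functions modifies the string it appears to extract tokens from: each delimiter that ends a token is overwritten with a null byte. As you have experienced, this is confusing and often counterproductive.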
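To make the modification visible, here is a minimal sketch using the standard strtok(), which overwrites delimiters the same way. The helper name length_after_tokenizing is hypothetical. It splits a row on commas and then reports how long the original buffer looks afterwards.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes `row` on commas with the standard strtok()
 * and returns strlen(row) once the loop is done. Because strtok() overwrites
 * the delimiter after each token with '\0', the buffer then reads as its
 * first field only. */
size_t length_after_tokenizing(char *row) {
    for (char *cell = strtok(row, ","); cell != NULL; cell = strtok(NULL, ",")) {
        /* a real program would inspect `cell` here */
    }
    return strlen(row);
}
```

Called on a writable buffer holding 00000002,anotherPerson,01287665478,newDepartment, the row is cut down to its first field, which is why the outer token in the question stops being the complete line.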

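Further confusion comes from the semantics of this tokenization, which are also often misunderstood: for example, strtok(token, ",") treats any run of consecutive commas as a single separator, which means it cannot handle empty comma-separated fields. strtok_s() is a Microsoft extension that is not always available on non-Microsoft systems, and it behaves the same way. Consider not using these functions at all.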
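The empty-field problem can be illustrated with a short sketch. Both function names below are hypothetical. The first counts tokens with the standard strtok(), the second counts fields by walking the row with strcspn().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok() produces for a row.
 * The row is modified in the process. */
size_t count_tokens_strtok(char *row) {
    size_t tokens = 0;
    for (char *cell = strtok(row, ","); cell != NULL; cell = strtok(NULL, ",")) {
        tokens++;
    }
    return tokens;
}

/* Hypothetical helper: counts comma-separated fields, empty ones included,
 * without modifying the row. */
size_t count_fields_strcspn(const char *row) {
    size_t fields = 1;              /* even an empty row holds one empty field */
    for (;;) {
        row += strcspn(row, ",");   /* advance to the next comma or the end */
        if (*row != ',') {
            break;                  /* reached the terminating '\0' */
        }
        row++;                      /* step over the comma */
        fields++;
    }
    return fields;
}
```

For a row such as 00000001,,0134333334 the strtok() version reports two tokens because the empty second field is silently dropped, while the strcspn() version reports three fields.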

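You should use strcspn() to skip rows and columns and reach the target cell, test its size and contents, and return a pointer to the row if it matches or NULL if it does not. This way, you can restart the search from the last match.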

#include <stdio.h>
#include <string.h>

char *select_row(const char *data, int col, const char *value) {
    size_t value_len = strlen(value);
    while (*data != '\0') {
        const char *p = data;
        size_t row_len = strcspn(p, "\n"); // length of the row, excluding the newline
        for (int i = 0; i < col; i++) {
            p += strcspn(p, ",\n");  // skip the cell contents
            if (*p == ',') {
                p++;  // skip the comma to point to the next cell
            }
        }
        // p points to the column data
        size_t cell_len = strcspn(p, ",\n");  // compute the cell contents' length
        if (cell_len == value_len && memcmp(p, value, value_len) == 0) {
            // if there is a match, return a pointer to the beginning of the row.
            // beware that this is not a token: the data was not modified, so
            // the row data stops at the newline but the string goes to the end of
            // the database.
            // you can return an allocated copy of the row with
            // return strndup(data, row_len);
            return (char *)data;
        }
        data += row_len;  // skip the row contents
        if (*data == '\n') {
            data++;   // skip the newline to point to the next row.
        }
    }
    return NULL;
}

int main() {
    const char *data = "00000001,otherPerson,0134333334,anotherDepartment\n"
                       "00000002,anotherPerson,01287665478,newDepartment\n"
                       "00000003,someoneElse,0139487632,otherDepartment\n"
                       "00000004,wholeNewPerson,01786666317,aDeparment\n"
                       "00000005,aPerson,013293842,otherDepartment\n"
                       "00000006,oldPerson,0133937333,anotherDepartment\n";

    const char *found = select_row(data, 1, "anotherPerson");
    if (found) {
        // measure the row only after checking that a match was found
        int length = (int)strcspn(found, "\n");
        printf("%.*s\n", length, found);
    }
    return 0;
}
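Restarting the search from the last match could look like the sketch below. So that the block compiles on its own, it restates the strcspn() search in condensed form under the hypothetical name next_matching_row. The name count_matching_rows is hypothetical as well.

```c
#include <stddef.h>
#include <string.h>

/* Condensed restatement of the strcspn()-based search (hypothetical name).
 * Returns a pointer to the first row at or after `data` whose column `col`
 * equals `value`, or NULL if there is none. The data is never modified. */
static const char *next_matching_row(const char *data, int col, const char *value) {
    size_t value_len = strlen(value);
    while (*data != '\0') {
        const char *p = data;
        size_t row_len = strcspn(data, "\n");
        for (int i = 0; i < col; i++) {
            p += strcspn(p, ",\n");          /* skip the cell contents */
            if (*p == ',') {
                p++;                         /* step over the comma */
            }
        }
        size_t cell_len = strcspn(p, ",\n");
        if (cell_len == value_len && memcmp(p, value, value_len) == 0) {
            return data;
        }
        data += row_len;                     /* skip the row contents */
        if (*data == '\n') {
            data++;                          /* step over the newline */
        }
    }
    return NULL;
}

/* Counts every matching row by resuming the search just past each match. */
size_t count_matching_rows(const char *data, int col, const char *value) {
    size_t matches = 0;
    const char *row = next_matching_row(data, col, value);
    while (row != NULL) {
        matches++;
        row += strcspn(row, "\n");           /* move to the end of the matched row */
        if (*row == '\n') {
            row++;                           /* start of the following row */
        }
        row = next_matching_row(row, col, value);
    }
    return matches;
}
```

Because the data is left untouched, the same buffer can be searched again for another column or value without reloading it.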

The simplest way to handle this is to make a copy of the outer token.

while (token != NULL) {
    char *token_orig = _strdup(token);  // untouched copy of the whole row
    char *sav2 = NULL;
    char *innerToken = strtok_s(token, ",", &sav2);
    int counter = 1;

    while (innerToken != NULL) {
        // Use token_orig (check it for NULL first: _strdup can fail)

        innerToken = strtok_s(NULL, ",", &sav2);
    }

    token = strtok_s(NULL, "\n", &sav1);
    free(token_orig);
}
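On systems without the Microsoft functions, the same copy-first idea can be sketched with the POSIX equivalents strtok_r() and strdup(). The function name find_row_copy and its parameters are hypothetical, and the sketch assumes the caller passes a writable buffer and frees the returned string.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r() and strdup() */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of the first row whose
 * column `col` equals `value`, or NULL if no row matches or allocation fails.
 * `records` is modified by the tokenizing. The caller frees the result. */
char *find_row_copy(char *records, int col, const char *value) {
    char *sav1 = NULL;
    for (char *token = strtok_r(records, "\n", &sav1); token != NULL;
         token = strtok_r(NULL, "\n", &sav1)) {
        /* copy the row before the inner pass cuts it into cells */
        char *token_orig = strdup(token);
        if (token_orig == NULL) {
            return NULL;
        }
        char *sav2 = NULL;
        int index = 0;
        for (char *inner = strtok_r(token, ",", &sav2); inner != NULL;
             inner = strtok_r(NULL, ",", &sav2), index++) {
            if (index == col && strcmp(inner, value) == 0) {
                return token_orig;   /* still the complete row */
            }
        }
        free(token_orig);            /* no match in this row */
    }
    return NULL;
}
```

The copy costs one allocation per row, and empty cells are still skipped by the tokenizer, so the column index is only reliable when no field is empty.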

