
C - How to get the string of the outer token in a nested strtok_s

Text block:

00000001,otherPerson,0134333334,anotherDepartment
00000002,anotherPerson,01287665478,newDepartment
00000003,someoneElse,0139487632,otherDepartment
00000004,wholeNewPerson,01786666317,aDeparment
00000005,aPerson,013293842,otherDepartment
00000006,oldPerson,0133937333,anotherDepartment

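I am trying to process a block of text data by checking whether a column in a row equals a given value and then getting the complete row. I split the text block into rows on \n, then split each row into columns on a comma. But inside the inner iteration that splits the row, the outer token is no longer the complete row. How can I keep the token intact?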

char *sav1 = NULL;
char *token = strtok_s(copyOfRecords, "\n", &sav1);
int counter = 0;

while (token != NULL) {
    char *sav2 = NULL;
    char *innerToken = strtok_s(token, ",", &sav2);
    int counter = 1;

    while (innerToken != NULL) {
        // the variable, "token" is not complete anymore in this block
        // How to keep the outer token complete?

        innerToken = strtok_s(NULL, ",", &sav2);
    }

    token = strtok_s(NULL, "\n", &sav1);
}

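The strtok family of functions modifies the string from which it appears to extract tokens: each delimiter that ends a token is overwritten with a null byte. As you have experienced, this is confusing and often counterproductive.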

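Further confusion comes from the semantics of this tokenization process, which are also commonly misunderstood: for example, strtok(token, ",") interprets any number of consecutive commas as a single separator, which means it cannot handle empty comma-separated fields. strtok_s() is a Microsoft extension that is not always available on non-Microsoft systems, and it behaves the same way. Consider not using these functions at all.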

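You should instead use strcspn() to skip over rows and columns and reach the target cell, test its size and contents, and return a pointer to the row if it matches or NULL if it does not. This way, you can restart the search from the last match.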

#include <stdio.h>
#include <string.h>

char *select_row(const char *data, int col, const char *value) {
    size_t value_len = strlen(value);
    while (*data != '\0') {
        const char *p = data;
        size_t row_len = strcspn(p, "\n"); // length of the row, not counting the newline
        for (int i = 0; i < col; i++) {
            p += strcspn(p, ",\n");  // skip the cell contents
            if (*p == ',') {
                p++;  // skip the comma to point to the next cell
            }
        }
        // p points to the column data
        size_t cell_len = strcspn(p, ",\n");  // compute the cell contents' length
        if (cell_len == value_len && memcmp(p, value, value_len) == 0) {
            // if there is a match, return a pointer to the beginning of the row.
            // beware that this is not a token: the data was not modified, so
            // the row data stops at the newline but the string goes to the end of
            // the database.
            // you can return an allocated copy of the row with
            // return strndup(data, row_len);
            return (char *)data;
        }
        data += row_len;  // skip the row contents
        if (*data == '\n') {
            data++;   // skip the newline to point to the next row.
        }
    }
    return NULL;
}

int main() {
    const char *data = "00000001,otherPerson,0134333334,anotherDepartment\n"
                       "00000002,anotherPerson,01287665478,newDepartment\n"
                       "00000003,someoneElse,0139487632,otherDepartment\n"
                       "00000004,wholeNewPerson,01786666317,aDeparment\n"
                       "00000005,aPerson,013293842,otherDepartment\n"
                       "00000006,oldPerson,0133937333,anotherDepartment\n";

    const char *found = select_row(data, 1, "anotherPerson");
    if (found) {
        // measure the row only after checking that a match was found
        int length = (int)strcspn(found, "\n");
        printf("%.*s\n", length, found);
    }
    return 0;
}

The simplest way to handle this is to make a copy of the outer token.

while (token != NULL) {
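    // copy the row before it is split (_strdup is Microsoft-specific; POSIX has strdup)
    char *token_orig = _strdup(token);
    char *sav2 = NULL;
    char *innerToken = strtok_s(token, ",", &sav2);
    int counter = 1;

    while (innerToken != NULL) {
        // Use token_orig: it still holds the complete row

        innerToken = strtok_s(NULL, ",", &sav2);
    }

    token = strtok_s(NULL, "\n", &sav1);
    free(token_orig);  // release the copy; free is declared in <stdlib.h>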
}
