
How to use flexible array members in nested C structs?

Related: flexible array member in a nested struct

I am trying to parse some data into a struct. The data contains information organized as follows:

struct unit {

    struct unit_A {
        // 28 bytes each

        // dependency r6scA 1
        char dr6scA1_tagclass[4];
        uint32_t dr6scA1_tagnamepointer;
        uint32_t dr6scA1_tagnamestringlength;
        uint32_t dr6scA1_tagid;

        // 12 bytes of 0x00

    }A;

    // A strings

    struct unit_B {
        // 48 bytes each

        // dependency r6scB 1
        char dr6scB1_tagclass[4];
        uint32_t dr6scB1_tagnamepointer;
        uint32_t dr6scB1_tagnamestringlength;
        uint32_t dr6scB1_tagid;

        // 32 bytes of 0x00

    }B;

    // B strings

    // unit strings

}unit_container;

You can ignore the weird nomenclature.

My line comments // A strings, // B strings and // unit strings each mark a run of null-terminated C strings. The number of strings matches the number of unit_A, unit_B, and unit struct entries in the data. For example, if there are 5 entries of A in unit_container, then there are 5 C strings at the location marked // A strings.

Since I cannot use flexible array members at these locations, how should I interpret what is essentially an unknown number of variable-length C strings at these locations in the data?

For example, the data at these locations could be:

"The first entry is here.\0Second entry\0Another!\0Fourth.\0This 5th entry is the bestest entry evah by any reasonable standards.\0"

...which I expect I should interpret as:

char unit_A_strings[]

...but this is not possible. What are my options?

Thank you for your consideration.

EDIT:

I think the most attractive option so far is:

char** unit_A_strings; to point to an array of char strings.

If I do: char unit_A_strings[1]; to define a char array with a fixed size of 1 char, then I must abandon sizeof(unit) and such, or hassle with memory allocation sizes, even though it is the most accurate description of the kind of data present. The same situation occurs if I do char * unit_A_strings[1];.

Another question: What would be the difference between using char *unit_A_strings; and char** unit_A_strings;?
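To make that difference concrete, here is a minimal sketch. The helper name nth_string is hypothetical and not part of the data format. A char * is a single pointer to the start of the packed region, so reaching string number n means stepping over the n strings before it. A char ** points at a separate array holding one pointer per string, so the same lookup becomes a plain strings[n].

```c
#include <stddef.h>
#include <string.h>

/* char* view: one pointer to the start of the packed region.
 * Finding string number n means skipping n strings one at a time.
 * The caller must guarantee the region holds more than n strings. */
const char *nth_string(const char *blob, size_t n)
{
    while (n-- > 0)
        blob += strlen(blob) + 1;   /* step over the string and its NUL */
    return blob;
}
```

With the sample data, nth_string(data, 1) would return a pointer to "Second entry". The scan costs time proportional to n on every call, which is the price of not keeping a pointer array.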

Conclusion:

The main problem is that structs are intended for fixed-size information, and what I need is a variable-sized memory region. So I can't legitimately store the data in the struct, at least not as part of the struct itself. This means that any other interpretation would be all right, and it seems to me that char** is the best available option for this struct situation.

I think you can use char** instead (or write a small structure to wrap it). For example, you can write a helper function to decode your stream.

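#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Placeholders you must supply. calculateIthStringLength() is assumed to
   return the string's length including its terminating NUL. */
unsigned int decodeNumberOfCString(const uint8_t* stream);
unsigned int calculateIthStringLength(const uint8_t* stream, unsigned int start);

char** decodeMyStream(uint8_t* stream, unsigned int* numberOfCString)
{
    *numberOfCString = decodeNumberOfCString(stream);
    char** cstrings = malloc((*numberOfCString) * sizeof(char*));
    if (cstrings == NULL)
        return NULL;
    unsigned int start = 0;
    for (unsigned int i = 0; i < *numberOfCString; ++i)
    {
        unsigned int len = calculateIthStringLength(stream, start);
        cstrings[i] = malloc(len * sizeof(char));
        if (cstrings[i] != NULL)
            memcpy(cstrings[i], stream + start, len); // copies the NUL too
        start += len;
    }
    return cstrings;
}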

This is only quick, unpolished example code; you can probably come up with a better algorithm.
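One possible refinement, offered only as a sketch: if the raw bytes stay in memory, the pointer array can point straight into the buffer instead of copying each string. The function name split_strings and its parameters are hypothetical, and the sketch assumes the number of strings is already known from the fixed-size records. It also checks that every string is terminated inside the buffer, which matters when parsing untrusted data.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Index `count` NUL-terminated strings stored back to back in buf[0..size).
 * Returns a malloc'd array of `count` pointers into buf (free() it when
 * done), or NULL if allocation fails or the buffer ends before `count`
 * terminated strings were found. */
char **split_strings(char *buf, size_t size, size_t count)
{
    char **out = malloc((count ? count : 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    size_t pos = 0;
    for (size_t i = 0; i < count; ++i) {
        /* Look for the terminator without reading past the buffer. */
        char *nul = (pos < size) ? memchr(buf + pos, '\0', size - pos) : NULL;
        if (nul == NULL) {
            free(out);
            return NULL;
        }
        out[i] = buf + pos;
        pos = (size_t)(nul - buf) + 1;  /* start of the next string */
    }
    return out;
}
```

Because nothing is copied, the buffer must outlive the returned array, and only the array itself needs to be freed.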

I think the closest you're going to get is by providing an array of strings:

char *AStrings[] = { "The first entry is here.",
                     "Second entry",
                     "Another!",
                     "Fourth.",
                     "This 5th entry is the bestest entry evah by any reasonable standards.",
                     NULL
                   };

Note two things:

  1. AStrings is an array of pointers-to-strings - it will be 6 (see 2. below) consecutive pointers that point to the actual strings, NOT the 'compound' string you used in your example.
  2. I ended AStrings with a NULL pointer, to resolve the "when do I finish?" question.

So you can "fall off the end" of A and start looking at locations as pointers - but be careful! The compiler may put in all sorts of padding between one variable and the next, mucking up any assumptions about where they are relative to each other in memory - including reordering them!
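The NULL sentinel from point 2 is what lets a loop find the end of the pointer array without a separate count. A minimal sketch, where count_strings is a hypothetical helper name:

```c
#include <stddef.h>

/* Count the entries of a NULL-terminated array of string pointers,
 * not counting the NULL sentinel itself. */
size_t count_strings(char *const *strings)
{
    size_t n = 0;
    while (strings[n] != NULL)
        ++n;
    return n;
}
```

Called on the AStrings array above, this walks the five string pointers and stops at the sixth, the NULL.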

Edit: Oh! I just had a thought. Another data representation that may help is essentially what you did. I've 'prettied' it up a bit:

char AString[] = "The first entry is here.\0"
                 "Second entry\0"
                 "Another!\0"
                 "Fourth.\0"
                 "This 5th entry is the bestest entry evah by any reasonable standards.\0";
  • The C compiler will automatically concatenate two 'adjacent' string literals as though they were one string - with no NUL character between them. That is why I put the \0 escapes in explicitly above.
  • The C compiler will automatically put a '\0' at the end of any string literal - at the semicolon ( ; ) in the above example. That means that the string actually ends with two NUL characters, not one.

You can use that fact to keep track of where you are while parsing the string 'array' - assuming that every desired value has a (sub)string of more than zero length! As soon as you encounter a zero-length (sub)string, you know you've reached the end of the string 'array'.

I call these kinds of strings ASCIIZZ strings (ASCIIZ strings with a second NUL at the end of all of them).
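Walking such a double-NUL-terminated list could look like the sketch below. The name count_asciizz is hypothetical, and the code relies on the assumption stated above that no real entry is an empty string.

```c
#include <stddef.h>
#include <string.h>

/* Count the substrings in a double-NUL-terminated list: non-empty
 * strings stored back to back, ended by an empty string. */
size_t count_asciizz(const char *list)
{
    size_t n = 0;
    while (*list != '\0') {         /* an empty substring marks the end */
        ++n;
        list += strlen(list) + 1;   /* step over the string and its NUL */
    }
    return n;
}
```

The same loop shape works for any per-string processing: replace the ++n with whatever should happen to the substring that list currently points at.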
