
How can I initialize a flexible array in rodata and create a pointer to it?

In C, the code

char *c = "Hello world!";

stores Hello world!\0 in rodata and initializes c with a pointer to it. How can I do this with something other than a string?

Specifically, I am trying to define my own string type

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

And then want some sort of macro so that I can say

const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

And have it behave the same, in that \x0c\0\0\0Hello world! is stored in rodata and c2 is initialized with a pointer to it.

I tried using

#define PASCAL_STRING_CONSTANT(c_string_constant) \
    &((const PascalString) { \
        .Length=sizeof(c_string_constant)-1, \
        .Data=(c_string_constant), \
    })

as suggested in these questions, but it doesn't work because Data is a flexible array member: gcc reports error: non-static initialization of a flexible array member, and clang gives a similar error.

Is this possible in C? And if so, what would the PASCAL_STRING_CONSTANT macro look like?

To clarify

With a C string, the following code-block never stores the string on the stack:

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    const char *c = "Hello world!";

    printf("test %s", c);

    return 0;
}

As we can see by looking at the assembly, line 5 compiles to just loading a pointer into a register.

I want the same behavior with Pascal strings, and with GNU extensions it is possible. The following code also never stores the Pascal string on the stack:

#include <inttypes.h>
#include <stdio.h>

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
        static const PascalString _tmpstr = { \
            .Length=sizeof(c_string_constant)-1, \
            .Data=c_string_constant, \
        }; \
        &_tmpstr; \
    })

int main(void) {
    const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

    printf("test %.*s", (int)c2->Length, c2->Data); /* .* expects an int, not a size_t */

    return 0;
}

Looking at its generated assembly, line 18 is also just loading a pointer.

However, the best code I've found to do this in ANSI C produces code to copy the entire string onto the stack:

#include <inttypes.h>
#include <stdio.h>

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

#define PASCAL_STRING_CONSTANT(initial_value) \
    (const PascalString *)&(const struct { \
        size_t Length; /* must match PascalString's Length type */ \
        char Data[sizeof(initial_value)]; \
    }){ \
        .Length = sizeof(initial_value)-1, \
        .Data = initial_value, \
    }

int main(void) {
    const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

    printf("test %.*s", (int)c2->Length, c2->Data); /* .* expects an int, not a size_t */

    return 0;
}

In the generated assembly for this code, line 19 copies the entire struct onto the stack and then produces a pointer to it.

I'm looking for either ANSI C code that produces the same assembly as my second example, or an explanation of why that's not possible with ANSI C.

This can be done with the statement-expressions GNU extension, although it is nonstandard.

#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
        static const PascalString _tmpstr = { \
            .Length=sizeof(c_string_constant)-1, \
            .Data=c_string_constant, \
        }; \
        &_tmpstr; \
    })

The extension lets a block enclosed in ({ ... }) be used as an expression whose value is that of the last statement in the block. Thus, we can declare our PascalString as a static const object inside the block and then yield a pointer to it.
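One consequence worth spelling out: because the object is static, the pointer stays valid after the enclosing function returns, and every call sees the same object. The following is only a sketch, assuming GCC or Clang in their default GNU mode (both the statement expression and the static initialization of a flexible array member are extensions); get_greeting is a hypothetical helper name that does not appear in the question.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t Length;
    char Data[];
} PascalString;

/* GNU C only: statement expression plus static initialization of a
   flexible array member. Not valid in strictly conforming ISO C. */
#define PASCAL_STRING_CONSTANT(c_string_constant) ({ \
        static const PascalString _tmpstr = { \
            .Length = sizeof(c_string_constant) - 1, \
            .Data = c_string_constant, \
        }; \
        &_tmpstr; \
    })

/* Hypothetical helper: the returned pointer refers to a static object,
   so it remains valid after this function returns. */
const PascalString *get_greeting(void) {
    return PASCAL_STRING_CONSTANT("Hello world!");
}
```

Calling get_greeting() twice should yield the same address, since there is only one static object per expansion of the macro.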

For completeness, we can also make a stack buffer if we want to modify it:

#define PASCAL_STRING_STACKBUF(initial_value, capacity) \
    (PascalString *)&(struct { \
        size_t Length; /* must match PascalString's Length type */ \
        char Data[capacity]; \
    }){ \
        .Length = sizeof(initial_value)-1, \
        .Data = initial_value, \
    }

You can use this macro, which takes the name of the variable and its contents, to declare such a string:

#define PASCAL_STRING(name, str) \
    struct { \
        unsigned char len; \
        char content[sizeof(str) - 1]; \
    } name = { sizeof(str) - 1, str }

Use it like this:

const PASCAL_STRING(c2, "Hello world!");
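For a slightly fuller picture, here is a sketch of that macro used at file scope, where the object has static storage duration and the compiler is free (though not required) to place it in a read-only section. The names greeting, print_greeting and greeting_to_cstr are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Macro from the answer: declares a struct object called `name` whose
   array holds `str` without the terminating '\0'. */
#define PASCAL_STRING(name, str) \
    struct { \
        unsigned char len; \
        char content[sizeof(str) - 1]; \
    } name = { sizeof(str) - 1, str }

/* File-scope const object with static storage duration. */
static const PASCAL_STRING(greeting, "Hello world!");

/* Hypothetical helper: prints the contents using the stored length,
   since content is not null-terminated. */
void print_greeting(void) {
    printf("%.*s\n", (int)greeting.len, greeting.content);
}

/* Hypothetical helper: copies the contents into buf as a C string.
   Returns the length, or 0 if buf is too small. */
size_t greeting_to_cstr(char *buf, size_t bufsize) {
    size_t n = greeting.len;
    if (bufsize <= n) {
        return 0;
    }
    memcpy(buf, greeting.content, n);
    buf[n] = '\0';
    return n;
}
```

Note that the length field is an unsigned char here, so this variant only works for strings of at most 255 characters, and each string gets its own anonymous struct type rather than a shared PascalString type.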

The answer is no: you cannot initialize a flexible array in .rodata and create a pointer to it in plain C.

There are a few reasons for this; as a starting point, standard C doesn't specify a .rodata section. Another reason is that something similar could be implemented almost equivalently with pointers.

There are many workarounds, including allocating the memory with malloc, using a (somewhat) fixed size for the Data array, or using statement expressions, but you have ruled these out, either because they don't store the result in .rodata (i.e. they store it on the stack or the heap) or because they use GNU extensions. Therefore, no portable solution will be able to do exactly what you want.
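As an illustration of the first workaround (malloc), here is a minimal sketch of the usual way to allocate a struct with a flexible array member in standard C. The helper name pascal_string_new is an assumption made for this example; the result lives on the heap, not in .rodata, which is exactly why it doesn't satisfy the question.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t Length;
    char Data[];
} PascalString;

/* Hypothetical helper: builds a heap-allocated PascalString from a C
   string. Returns NULL on allocation failure; release it with free(). */
PascalString *pascal_string_new(const char *c_string) {
    size_t len = strlen(c_string);
    /* sizeof *s covers the fixed part; len bytes are added for Data. */
    PascalString *s = malloc(sizeof *s + len);
    if (s == NULL) {
        return NULL;
    }
    s->Length = len;
    memcpy(s->Data, c_string, len); /* no terminating '\0' is stored */
    return s;
}
```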

The C standard specifies that you can't initialize a flexible array member in ISO/IEC 9899:1999 section 6.7.2.1 point 18:

 struct s { int n; double d[]; };

[...]

 struct s t2 = { 1, { 4.2 }}; // invalid

[...]

The initialization of t2 is invalid (and violates a constraint) because struct s is treated as if it did not contain member d.

[...]

Nevertheless, it cannot appear in strictly conforming code.

So, to clarify: Standard C specifies none of these concepts (stack, rodata, assembly) that you expect to be able to change. Therefore, unless you have a compiler that allows you to change these things (*cough* GCC), you can't change them. The compiler has full freedom to change whatever it wants as long as a valid program (without implementation-defined, unspecified, or undefined behavior) behaves the same way.

This is a comment, not an answer (I don't have enough reputation to comment on the question). I am just curious why this won't work:

typedef struct {
  const char *data;
  unsigned char len;
} PascalString;
const PascalString s = { "new string", sizeof("new string") - 1 }; /* sizeof is a constant expression; a strlen call is not */
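A pointer-based variant along these lines can be written in standard C. The sketch below uses hypothetical names (PascalStringRef, PASCAL_STRING_REF, greeting_ref, ref_equals) that are not from the question. The difference from what the question asks for is that there are two objects, the struct and the character data, so reaching the characters costs one extra indirection and the bytes are not laid out as length followed by data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer-based string: the struct holds a pointer to the
   characters instead of containing them. */
typedef struct {
    const char *data;
    size_t len;
} PascalStringRef;

/* sizeof on a string literal counts the terminating '\0', hence the -1. */
#define PASCAL_STRING_REF(lit) { lit, sizeof(lit) - 1 }

/* Both the struct and the literal have static storage duration. */
static const PascalStringRef greeting_ref = PASCAL_STRING_REF("new string");

/* Hypothetical helper: compares the stored bytes against a C string. */
int ref_equals(const PascalStringRef *s, const char *c_string) {
    size_t n = strlen(c_string);
    return s->len == n && memcmp(s->data, c_string, n) == 0;
}
```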

I am not sure why you would want to do it, but you could do it this way. This method will store your string in the data segment and gives you a way to access it as a structure. Note that I create a packed structure to ensure that the mapping into the structure always works since I have essentially hard coded the data fields in the const expression below.

#include <stdio.h>

#pragma pack(1)
typedef struct {
   unsigned char Length;
   char Data[];
} PascalString;
#pragma pack()

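const unsigned char HELLO[7] = {
    0x05,                      /* Length: 5 characters, terminator not counted */
    'H','E','L','L','O','\0'   /* '\0' kept only so that %s can print Data */
};


int main(void) {
        /* Keep the const qualifier: HELLO is a const object. */
        const PascalString *myString = (const PascalString *)HELLO;
        printf("I say: %s \n", myString->Data);
        return 0;
}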
