
How can I initialize a flexible array in rodata and create a pointer to it?

In C, the code

char *c = "Hello world!";

stores Hello world!\0 in rodata and initializes c with a pointer to it. How can I do this with something other than a string?

Specifically, I am trying to define my own string type

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

I then want some sort of macro so that I can say

const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

and have it behave the same way: \x0c\0\0\0Hello world! is stored in rodata and c2 is initialized with a pointer to it.

I tried using

#define PASCAL_STRING_CONSTANT(c_string_constant) \
    &((const PascalString) { \
        .Length=sizeof(c_string_constant)-1, \
        .Data=(c_string_constant), \
    })

as suggested in these questions, but it doesn't work because Data is a flexible array: gcc reports error: non-static initialization of a flexible array member (clang gives a similar error).

Is this possible in C? And if so, what would the PASCAL_STRING_CONSTANT macro look like?

To clarify

With a C string, the following code block never stores the string on the stack:

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    const char *c = "Hello world!";

    printf("test %s", c);

    return 0;
}

As we can see by looking at the assembly, line 5 compiles to just loading a pointer into a register.

I want to get that same behavior with Pascal strings, and with GNU extensions it is possible. The following code also never stores the Pascal string on the stack:

#include <inttypes.h>
#include <stdio.h>

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
        static const PascalString _tmpstr = { \
            .Length=sizeof(c_string_constant)-1, \
            .Data=c_string_constant, \
        }; \
        &_tmpstr; \
    })

int main(void) {
    const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

    printf("test %.*s", (int)c2->Length, c2->Data); /* the .* precision must be an int */

    return 0;
}

Looking at its generated assembly, line 18 is also just loading a pointer.

However, the best code I've found to do this in ANSI C compiles to instructions that copy the entire string onto the stack:

#include <inttypes.h>
#include <stdio.h>

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

#define PASCAL_STRING_CONSTANT(initial_value) \
    (const PascalString *)&(const struct { \
        size_t Length; /* same type as PascalString's Length, so the layouts agree */ \
        char Data[sizeof(initial_value)]; \
    }){ \
        .Length = sizeof(initial_value)-1, \
        .Data = initial_value, \
    }

int main(void) {
    const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

    printf("test %.*s", (int)c2->Length, c2->Data); /* the .* precision must be an int */

    return 0;
}

In the generated assembly for this code, line 19 copies the entire struct onto the stack and then produces a pointer to it.

I'm looking for either ANSI C code that produces the same assembly as my second example, or an explanation of why that's not possible with ANSI C.

This can be done with the GNU statement-expressions extension, although it is nonstandard.

#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
        static const PascalString _tmpstr = { \
            .Length=sizeof(c_string_constant)-1, \
            .Data=c_string_constant, \
        }; \
        &_tmpstr; \
    })

The extension lets you use a block of several statements as an expression by enclosing the block in ({ ... }); the expression evaluates to the value of the last statement. Thus, we can declare our PascalString as a static const object and then yield a pointer to it.
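The macro above relies on a second extension besides the statement expression: GCC and Clang accept an initializer for a flexible array member when the object has static storage duration, which is why the error quoted in the question mentions only "non-static" initialization. A minimal sketch of the same idea at file scope, assuming GCC or Clang; the names HELLO and pascal_equals are hypothetical and not part of the original answer:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t Length;
    char Data[];
} PascalString;

/* GNU extension: a flexible array member of an object with static storage
 * duration may be given an initializer. ISO C does not permit this. */
static const PascalString HELLO = {
    .Length = sizeof("Hello world!") - 1,
    .Data = "Hello world!",
};

/* Hypothetical helper: compare a PascalString against a C string. */
static int pascal_equals(const PascalString *s, const char *c_string)
{
    return s->Length == strlen(c_string)
        && memcmp(s->Data, c_string, s->Length) == 0;
}
```

Because HELLO is a named static const object, a pointer to it can be taken without any copy at run time. The cost compared with the macro is that each constant needs its own name.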

For completeness, we can also make a stack buffer if we want to modify it:

#define PASCAL_STRING_STACKBUF(initial_value, capacity) \
    (PascalString *)&(struct { \
        size_t Length; /* same type as PascalString's Length, so the layouts agree */ \
        char Data[capacity]; \
    }){ \
        .Length = sizeof(initial_value)-1, \
        .Data = initial_value, \
    }

You can use this macro, which takes the name of the variable and its contents:

#define PASCAL_STRING(name, str) \
    struct { \
        unsigned char len; \
        char content[sizeof(str) - 1]; \
    } name = { sizeof(str) - 1, str }

Use it like this to create such a string:

const PASCAL_STRING(c2, "Hello world!");
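As a rough illustration of how that declaration might be used, here is a sketch that places the object at file scope, where it has static storage duration and a typical toolchain is free to put it in a read-only section. The names greeting, greeting_length and greeting_last_char are hypothetical, and the placement is an assumption about the toolchain rather than something the C standard promises:

```c
#include <stddef.h>

#define PASCAL_STRING(name, str) \
    struct { \
        unsigned char len; \
        char content[sizeof(str) - 1]; \
    } name = { sizeof(str) - 1, str }

/* The array has no room for the terminating NUL, which C (unlike C++)
 * allows when initializing a char array from a string literal. */
static const PASCAL_STRING(greeting, "Hello world!");

/* Hypothetical accessor: the stored length byte. */
static size_t greeting_length(void)
{
    return greeting.len;
}

/* Hypothetical accessor: the last character of the text. */
static char greeting_last_char(void)
{
    return greeting.content[greeting.len - 1];
}
```

Two limits are worth keeping in mind: len is an unsigned char, so the text can hold at most 255 characters, and every use of the macro declares a distinct anonymous struct type, so the objects cannot be passed to one common function without a cast.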

The answer is no: you cannot initialize a flexible array in .rodata and create a pointer to it in plain C.

There are a few reasons for this; as a starting point, standard C doesn't specify a .rodata section. Another reason is that something similar could be implemented almost equivalently with pointers.

There are many solutions to this, including allocating the memory with malloc, using a (somewhat) fixed size for the Data array, or using statement expressions, but you have ruled these out (because they either don't store the result in .rodata, i.e. they store it on the stack, or they use GNU extensions). Therefore, no portable solution will be able to do exactly what you want.
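For reference, the malloc route mentioned above is the way the standard expects a flexible array member to be used. A minimal sketch follows; the helper name pascal_string_new is hypothetical, and the result lives on the heap, not in .rodata, so it does not meet the goal stated in the question:

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t Length;
    char Data[];
} PascalString;

/* Hypothetical helper: heap-allocate a PascalString holding a copy of the
 * first `length` bytes of `text`. Returns NULL if the allocation fails.
 * The caller releases the result with free(). */
static PascalString *pascal_string_new(const char *text, size_t length)
{
    /* sizeof *s covers the header only; add room for the characters. */
    PascalString *s = malloc(sizeof *s + length);
    if (s == NULL) {
        return NULL;
    }
    s->Length = length;
    memcpy(s->Data, text, length);
    return s;
}
```

A call such as pascal_string_new("Hello world!", 12) yields a modifiable string whose characters sit directly after the length field, at the price of a run-time allocation and copy.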

The C standard specifies that you can't initialize a flexible array member in ISO/IEC 9899:1999 section 6.7.2.1 point 18:

 struct s { int n; double d[]; };

[...]

 struct s t2 = { 1, { 4.2 }}; // invalid

[...]

The initialization of t2 is invalid (and violates a constraint) because struct s is treated as if it did not contain member d.

[...]

Nevertheless, it cannot appear in strictly conforming code.

So, to clarify: standard C specifies none of the concepts (stack, rodata, assembly) that you expect to be able to control. Therefore, unless you have a compiler that lets you control these things (*cough* GCC), you can't. The compiler has full freedom to change whatever it wants as long as a valid program (one without implementation-defined, unspecified, or undefined behavior) behaves the same way.

This is a comment, not an answer (since I don't have enough reputation to comment on the question). I am just curious why this won't work.

typedef struct {
  const char *data;
  unsigned char len;
} PascalString;
const PascalString s = { "new string", sizeof("new string") - 1 }; /* sizeof, not strlen: a file-scope initializer must be a constant expression */
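For comparison with the flexible-array layout, here is a sketch of that pointer-plus-length idea wrapped in a macro. It assumes the goal is only a read-only string that carries its length; the names PascalStringRef, PASCAL_REF and greeting_ref are hypothetical:

```c
#include <stddef.h>

/* Hypothetical pointer-plus-length variant (not the flexible-array layout
 * from the question): the characters stay in the string literal and the
 * struct only refers to them. */
typedef struct {
    const char *data;
    size_t len;
} PascalStringRef;

/* sizeof applied to a string literal is a constant expression, so this
 * initializer is valid at file scope; a strlen() call would not be a
 * constant expression in ISO C. */
#define PASCAL_REF(literal) { (literal), sizeof(literal) - 1 }

static const PascalStringRef greeting_ref = PASCAL_REF("new string");
```

This is portable, but it differs from what the question asks for: the length and the characters are two separate objects joined by a pointer, so the bytes are not laid out as a length immediately followed by the text.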

I am not sure why you would want to do it, but you could do it this way. This method stores your string in the data segment and gives you a way to access it as a structure. Note that I create a packed structure to ensure that the mapping into the structure always works, since I have essentially hard-coded the data fields in the const array below.

#include <stdio.h>

#pragma pack(1)
typedef struct {
   unsigned char Length;
   char Data[];
} PascalString;
#pragma pack()

const unsigned char HELLO[7] = { 
0x05, /* length of "HELLO", not counting the terminating NUL */
'H','E','L','L','O','\0'
};


int main(void) {
        const PascalString *myString = (const PascalString *)HELLO; /* keep const: HELLO is read-only */
        printf("I say: %s \n", myString->Data);
}
