
C dummy struct, strict aliasing and static initialization

My first question wasn't well formulated, so here it is again, this time better asked and explained.

I want to hide the members of a struct while still being able to allocate and initialize the struct statically or on the stack. Most solutions out there use the opaque pointer idiom with dynamic memory allocation, which isn't always desired.

The idea for this example came from the following post:

https://www.reddit.com/r/C_Programming/comments/aimgei/opaque_types_and_static_allocation/

I know that this is probably UB, but I believe it should work fine on most consumer architectures, either 32-bit or 64-bit.

Now you may tell me that size_t may sometimes be bigger than void *, so using a void * member to force the union's alignment to that of void * may be wrong. Maybe that can happen, but I see it as the exception, not the rule.

Most compilers add padding so that a struct's size is a multiple of 4 or 8, depending on the architecture, and sizeof returns the size including that padding, so sizeof(Vector) and sizeof(RealVector) should be the same. Since Vector and RealVector also have the same alignment, that should be fine too.

If this is UB, how can I create a sort of scratchpad structure in C in a safe manner? In C++ we have alignas, alignof and placement new, which help make this ordeal a lot safer.

If that's not possible in C99, would it be safer in C11 with alignas and alignof?

#include <stdint.h>
#include <stdio.h>

/* In .h */

typedef union Vector {
    uint8_t data[sizeof(void *) + 2 * sizeof(size_t)];
    /* this is here to force the alignment of the union to that of void * */
    void * alignment;
} Vector;

void vector_initialize_version_a(Vector *);
void vector_initialize_version_b(Vector *);
void vector_debug(Vector const *);

/* In .c */

typedef struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
} RealVector;

void
vector_initialize_version_a(Vector * const t) {
    RealVector * const v = (RealVector *)t;
    v->data = NULL;
    v->length = 0;
    v->capacity = 8;
}

void
vector_initialize_version_b(Vector * const t) {
    *(RealVector *)t = (RealVector) {
        .data = NULL,
        .length = 0,
        .capacity = 16,
    };
}

void
vector_debug(Vector const * const t) {
    RealVector const * const v = (RealVector const *)t; /* keep the const qualifier */
    printf("Length: %zu\n", v->length);
    printf("Capacity: %zu\n", v->capacity);
}

/* In main.c */

int
main() {
    /*
    Compiled with:
    clang -std=c99 -O3 -Wall -Werror -Wextra -Wpedantic test.c -o main.exe
    */

    printf("%zu == %zu\n", sizeof(Vector), sizeof(RealVector));

    Vector vector;

    vector_initialize_version_a(&vector);
    vector_debug(&vector);

    vector_initialize_version_b(&vector);
    vector_debug(&vector);

    return 0;
}

Why not something simpler? It avoids the pointer punning:

typedef struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
} RealVector;

typedef struct Vector {
    uint8_t data[sizeof(RealVector)];
} Vector;

typedef union
{
    Vector      v;
    RealVector rv;
} RealVector_union;

void vector_initialize_version_a(void * const t) {
    RealVector_union * const v = t;
    v -> rv.data = NULL;
    v -> rv.length = 0;
    v -> rv.capacity = 8;
}

And

I'll post my answer from the previous question, which I didn't have time to post :)

Am I safe doing this?

No, you are not. But instead of finding a way to make it safe, just make compilation fail when it's not safe:

#include <assert.h>
#include <stdalign.h>
static_assert(sizeof(Vector) == sizeof(RealVector), "");
static_assert(alignof(Vector) == alignof(RealVector), "");

With checks written that way, you will know beforehand when there is going to be a problem, and you can then fix it for that specific environment. If the checks do not fire, you know it's fine.

How can I create a sort of scratchpad structure in C in a safe manner?

The only correct way to make it really safe would be a two-step process:

  • first, compile a test executable that outputs the size and alignment of struct RealVector
  • then generate the header file with the proper structure definition: struct Vector { alignas(REAL_VECTOR_ALIGNMENT) unsigned char data[REAL_VECTOR_SIZE]; };
  • and then continue with compiling the final executable
  • The test and final executables have to be compiled using the same compiler, version, options, settings and environment.
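The generator step above could be sketched roughly as follows. This is only an illustration: the function name format_vector_header, the include guard VECTOR_H and the idea of formatting into a buffer are assumptions, not part of the original answer. A real generator would copy RealVector from the implementation file and print the result to the generated header.

```c
#include <assert.h>
#include <stdalign.h> /* alignof (C11) */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Private layout, as defined in the implementation file. */
typedef struct RealVector {
    uint64_t *data;
    size_t length;
    size_t capacity;
} RealVector;

/*
 * Hypothetical helper: formats the text of the generated header into buf.
 * Returns what snprintf returns (the number of characters needed).
 */
int format_vector_header(char *buf, size_t n) {
    assert(buf != NULL);
    return snprintf(buf, n,
        "#ifndef VECTOR_H\n"
        "#define VECTOR_H\n"
        "#include <stdalign.h>\n"
        "#define REAL_VECTOR_SIZE %zu\n"
        "#define REAL_VECTOR_ALIGNMENT %zu\n"
        "struct Vector {\n"
        "    alignas(REAL_VECTOR_ALIGNMENT) unsigned char data[REAL_VECTOR_SIZE];\n"
        "};\n"
        "#endif\n",
        sizeof(RealVector), alignof(RealVector));
}
```

A small main in the test executable would print the buffer to stdout, and the build would redirect that output into the header before compiling the final executable with the same compiler and options.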

Notes:

  • Instead of a union, use a struct with alignas.
  • uint8_t is an integer with 8 bits. Use char, or better unsigned char, to represent a "byte".
  • sizeof(void *) is not guaranteed to be equal to sizeof(uint64_t *).
  • The assumption that the max alignment is either 4 or 8 does not hold in general: typically on x86_64, alignof(long double) is 16.
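Putting these notes together, a C11 sketch might look like the one below. Note that it swaps in a different access technique from the question: instead of casting the storage pointer to RealVector *, it copies the representation in and out with memcpy, which sidesteps the lvalue-access (strict aliasing) question. The names VECTOR_STORAGE_SIZE, vector_initialize and vector_capacity are made up for this example, and alignas(max_align_t) is just one conservative choice of alignment.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignas, alignof (C11) */
#include <stddef.h>   /* size_t, max_align_t */
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Header part: opaque byte storage, over-aligned to be on the safe side. */
#define VECTOR_STORAGE_SIZE (sizeof(uint64_t *) + 2 * sizeof(size_t))

typedef struct Vector {
    alignas(max_align_t) unsigned char data[VECTOR_STORAGE_SIZE];
} Vector;

/* Implementation part: the real layout stays private. */
typedef struct RealVector {
    uint64_t *data;
    size_t length;
    size_t capacity;
} RealVector;

/* Fail the build if the storage cannot hold a RealVector. */
static_assert(VECTOR_STORAGE_SIZE >= sizeof(RealVector), "storage too small");
static_assert(alignof(Vector) >= alignof(RealVector), "storage under-aligned");

void vector_initialize(Vector *t) {
    RealVector v = { .data = NULL, .length = 0, .capacity = 8 };
    memcpy(t->data, &v, sizeof v); /* copy the representation into the bytes */
}

size_t vector_capacity(const Vector *t) {
    RealVector v;
    memcpy(&v, t->data, sizeof v); /* copy back out before reading members */
    return v.capacity;
}
```

The caller can then write Vector v; vector_initialize(&v); with no dynamic allocation. The cost is a small fixed-size copy per call, which compilers can often optimize away.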

One possibility is to define Vector as follows in the .h file:

/* In vector.h file */
struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
};

typedef union Vector {
    char data[sizeof(struct RealVector)];
    /* these are here to force the alignment of the union */
    uint64_t * alignment1_;
    size_t alignment2_;
} Vector;

That also defines struct RealVector for use in the vector implementation .c file:

/* In vector.c file */
typedef struct RealVector RealVector;

This has the advantage that the binary contents of Vector actually consist of a RealVector and are correctly aligned. The disadvantage is that a sneaky user could easily manipulate the contents of a Vector via pointer type casting.

A not-so-legitimate alternative is to remove struct RealVector from the .h file and replace it with an anonymous struct type of the same shape:

/* In vector.h file */
typedef union Vector {
    char data[sizeof(struct { uint64_t * a; size_t b; size_t c; })];
    /* these are here to force the alignment of the union */
    uint64_t * alignment1_;
    size_t alignment2_;
} Vector;

Then struct RealVector needs to be fully defined in the vector implementation .c file:

/* In vector.c file */
typedef struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
} RealVector;

This has the advantage that a sneaky user cannot easily manipulate the contents of a Vector without first defining another struct type of the same shape as the anonymous struct type. The disadvantage is that the anonymous struct type that forms the binary representation of Vector is not technically compatible with the RealVector type used in the vector implementation .c file, because the tags and member names are different.
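Since the shape now has to be written twice, the two definitions can drift apart. As a rough sketch (assuming a C11 compiler; the assertion messages are my own), the .c file could at least check at compile time that the size and alignment still match. This does not make the two types compatible; it only catches layout mismatches.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignof (C11) */
#include <stddef.h>
#include <stdint.h>

/* vector.h part: the anonymous struct only supplies the size. */
typedef union Vector {
    char data[sizeof(struct { uint64_t *a; size_t b; size_t c; })];
    /* these are here to force the alignment of the union */
    uint64_t *alignment1_;
    size_t alignment2_;
} Vector;

/* vector.c part: the real definition. */
typedef struct RealVector {
    uint64_t *data;
    size_t length;
    size_t capacity;
} RealVector;

/* Fail the build if the header's shape no longer matches the real struct. */
static_assert(sizeof(Vector) == sizeof(RealVector),
              "Vector and RealVector differ in size");
static_assert(alignof(Vector) == alignof(RealVector),
              "Vector and RealVector differ in alignment");
```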

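Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.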

 