Overly eager C++ union zero-initialization with constexpr

Below is a stripped-down example of a tagged union template "Storage", which can hold one of two types L and R enclosed in a union, plus a bool indicating which of them is stored. The instantiation uses two types of different sizes, the smaller one actually being empty.

#include <utility>

struct Empty
{
};

struct Big
{
        long a;
        long b;
        long c;
};

template<typename L, typename R>
class Storage final
{
public:
        constexpr explicit Storage(const R& right) : payload{right}, isLeft{false}
        {
        }

private:
        union Payload
        {
                constexpr Payload(const R& right) : right{right}
                {
                }
                L left;
                R right;
        };

        Payload payload;
        bool isLeft;
};

// Toggle constexpr here
constexpr static Storage<Big, Empty> createStorage()
{
        return Storage<Big, Empty>{Empty{}};
}

Storage<Big, Empty> createStorage2()
{        
        return createStorage();
}
  • The constructor initializes the R member with Empty, and calls only the union's constructor for that member
  • The union is never default-initialized as a whole
  • All constructors are constexpr

The function "createStorage2" should therefore only populate the bool tag and leave the union alone. So I would expect the following compile result with default optimization "-O":

createStorage2():
        mov     rax, rdi
        mov     BYTE PTR [rdi+24], 0
        ret

Both GCC and ICC instead generate something like

createStorage2():
        mov     rax, rdi
        mov     QWORD PTR [rdi], 0
        mov     QWORD PTR [rdi+8], 0
        mov     QWORD PTR [rdi+16], 0
        mov     QWORD PTR [rdi+24], 0
        ret

zeroing the entire 32-byte structure, while clang generates the expected code. You can reproduce this with https://godbolt.org/z/VsDQUu . GCC reverts to the desired initialization of only the bool tag when you remove constexpr from the "createStorage" static function, while ICC remains unimpressed and still fills all 32 bytes.
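For reference, the offset 24 of the tag and the 32-byte total follow from the object layout. The following is a minimal sketch, assuming an ABI with 8-byte long such as x86-64 System V. StorageMirror is a hypothetical stand-in, not part of the original code: it has public members so that offsetof can be used, and it is only assumed to be laid out like Storage<Big, Empty>.

```cpp
#include <cstddef>

struct Empty
{
};

struct Big
{
        long a;
        long b;
        long c;
};

// Hypothetical mirror of Storage<Big, Empty> with public members.
// Assumption: the compiler lays it out like the original class.
struct StorageMirror
{
        union Payload
        {
                Big left;
                Empty right;
        };

        Payload payload;
        bool isLeft;
};

constexpr std::size_t kPayloadSize = sizeof(StorageMirror::Payload);
constexpr std::size_t kTagOffset = offsetof(StorageMirror, isLeft);
constexpr std::size_t kTotalSize = sizeof(StorageMirror);

// The union is as large as its largest member, and the bool tag
// (alignment 1) follows it directly.
static_assert(kPayloadSize == sizeof(Big), "union takes the size of Big");
static_assert(kTagOffset == sizeof(Big), "tag sits right after the union");

// With 8-byte long this gives kTagOffset == 24 and, after tail padding,
// kTotalSize == 32, matching [rdi+24] and the four QWORD stores above.
```

So the single-byte store at [rdi+24] is the tag, and the three QWORD stores at offsets 0, 8 and 16 cover the union's storage for Big, which is never the active member here.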

Doing so is probably not a standard violation, as unused bits being "undefined" allows anything, including setting them to zero and consuming unnecessary CPU cycles. But it's annoying if you introduced the union for efficiency reasons in the first place and your union members vary widely in size.

What is going on here? Is there any way to work around this behavior, provided that removing constexpr from the constructors and the static function is not an option?

A side note: ICC seems to perform some extra operations even when all constexpr are removed, as in https://godbolt.org/z/FnjoPC :

createStorage2():
        mov       rax, rdi                                      #44.16
        mov       BYTE PTR [-16+rsp], 0                         #39.9
        movups    xmm0, XMMWORD PTR [-40+rsp]                   #44.16
        movups    xmm1, XMMWORD PTR [-24+rsp]                   #44.16
        movups    XMMWORD PTR [rdi], xmm0                       #44.16
        movups    XMMWORD PTR [16+rdi], xmm1                    #44.16
        ret                                                     #44.16

What is the purpose of these movups instructions?

(This is just speculation of mine, but it's too long for a comment)

What is going on here?

Since the constructors are constexpr, it could be that the Payload as a whole has some value computed at compile time. Then, at runtime, that complete Payload is returned. To my knowledge, a compiler is not required to recognize that a certain portion of a compile-time value is uninitialized and that it should generate no code for it.
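To illustrate what "computed at compile time" means here, the sketch below evaluates createStorage() in a constant expression. The accessor holdsLeft() and the names kStorage and runtimeTagIsRight() are hypothetical additions for this illustration and are not in the question's code. It shows only that the whole object can exist as a compile-time constant; it does not prove which code any particular compiler emits.

```cpp
struct Empty
{
};

struct Big
{
        long a;
        long b;
        long c;
};

template<typename L, typename R>
class Storage final
{
public:
        constexpr explicit Storage(const R& right) : payload{right}, isLeft{false}
        {
        }

        // Hypothetical accessor, added only so the tag can be inspected.
        constexpr bool holdsLeft() const
        {
                return isLeft;
        }

private:
        union Payload
        {
                constexpr Payload(const R& right) : right{right}
                {
                }
                L left;
                R right;
        };

        Payload payload;
        bool isLeft;
};

constexpr Storage<Big, Empty> createStorage()
{
        return Storage<Big, Empty>{Empty{}};
}

// The complete object, union included, is a compile-time constant here.
constexpr Storage<Big, Empty> kStorage = createStorage();
static_assert(!kStorage.holdsLeft(), "tag is known at compile time");

// Runtime path comparable to createStorage2() in the question.
inline bool runtimeTagIsRight()
{
        return !createStorage().holdsLeft();
}
```

If a compiler folds createStorage() into such a constant and then copies the constant's full representation into the return slot, stores to the bytes of the inactive member would be a plausible side effect, which is the speculation above.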

In some crazy compiler it could even happen that the compile-time Payload has garbage values in the uninitialized section, and then it would produce, for example:

createStorage2():
        mov     rax, rdi
        mov     QWORD PTR [rdi], 0xbaadf00d
        mov     QWORD PTR [rdi+8], 0xbaadf00d
        mov     QWORD PTR [rdi+16], 0xbaadf00d
        mov     QWORD PTR [rdi+24], 0
        ret

In general constexpr doesn't like uninitialized values, but unions are a bit of a way around that.
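A minimal sketch of that last point, using a hypothetical union Slot that is not part of the question's code: a union object can be a constant expression even though most of its storage is never initialized, as long as only the active member is read.

```cpp
union Slot
{
        constexpr explicit Slot(char c) : small{c}
        {
        }
        char small;   // initialized, becomes the active member
        long big[3];  // never initialized
};

// Valid constant, although the storage of `big` was never written.
constexpr Slot kSlot{'x'};
static_assert(kSlot.small == 'x', "active member is readable at compile time");

// Reading the inactive member is not a constant expression, so a line
// like the following is rejected by the compiler:
// static_assert(kSlot.big[0] == 0, "");

constexpr char readSmall(const Slot& s)
{
        return s.small;
}
```

The compile-time value of kSlot therefore has a well-defined part (small) and a part with no value at all (big), and the language leaves it to the implementation what, if anything, ends up in those bytes at runtime.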
