Strict aliasing and binary I/O

Question

Consider the following (simplified) code that reads the contents of a binary file:

struct Header
{
    char signature[8];
    uint32_t version;
    uint32_t numberOfSomeChunks;
    uint32_t numberOfSomeOtherChunks;
};

void readFile(std::istream& stream)
{
    // find total size of the file, in bytes:
    stream.seekg(0, std::ios::end);
    const std::size_t totalSize = stream.tellg();

    // allocate enough memory and read entire file
    std::unique_ptr<std::byte[]> fileBuf = std::make_unique<std::byte[]>(totalSize);
    stream.seekg(0);
    stream.read(reinterpret_cast<char*>(fileBuf.get()), totalSize);

    // get the header and do something with it:
    const Header* hdr = reinterpret_cast<const Header*>(fileBuf.get());

    if(hdr->version != expectedVersion) // <- Potential UB?
    {
        // report the error
    }

    // and so on...
}

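The way I see it, the following line:

if(hdr->version != expectedVersion) // <- Potential UB?

contains undefined behavior: we are reading the version member of type uint32_t, which is overlaid on top of an array of std::byte objects, and the compiler is free to assume that a uint32_t object does not alias anything else.

The question is: is my interpretation correct? If yes, what can be done to fix this code? If no, why is there no UB here?

Note 1: I understand the purpose of the strict aliasing rule (allowing the compiler to avoid unnecessary loads from memory). I also know that using std::memcpy is a safe solution in this case, but using std::memcpy would mean additional memory allocations (on the stack, or on the heap if the size of the object is not known).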
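
For reference, the std::memcpy approach mentioned in Note 1 could look roughly like the sketch below. The helper name parseHeader and the std::optional return type are assumptions made for illustration; they are not part of the original code. For a fixed-size struct such as Header the "extra allocation" is a 20-byte local object, and compilers typically lower a fixed-size std::memcpy to plain loads and stores.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

struct Header
{
    char signature[8];
    std::uint32_t version;
    std::uint32_t numberOfSomeChunks;
    std::uint32_t numberOfSomeOtherChunks;
};

// Hypothetical helper (not from the original post): copies the first
// sizeof(Header) bytes of a raw buffer into a real Header object.
// Returns std::nullopt if the buffer is missing or too short.
std::optional<Header> parseHeader(const std::byte* buf, std::size_t size)
{
    if (buf == nullptr || size < sizeof(Header))
        return std::nullopt;

    Header hdr;                              // a genuine Header object on the stack
    std::memcpy(&hdr, buf, sizeof(Header));  // byte-wise copy, no aliasing violation
    return hdr;
}
```

In readFile this would replace the reinterpret_cast to const Header*, for example parseHeader(fileBuf.get(), totalSize), followed by a check that the result has a value.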

Answer 1

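The question is: is my interpretation correct?

Yes.

If yes, what can be done to fix this code?

You already know that memcpy is a solution. However, you can skip both the memcpy and the extra buffer by reading directly into the header object: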

Header h;
stream.read(reinterpret_cast<char*>(&h), sizeof h);
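
A slightly fuller sketch of the same idea, with a check for short reads, might look like this. The helper name readHeader and the std::optional return type are assumptions added for illustration; the two lines above are the actual suggestion.

```cpp
#include <cstdint>
#include <istream>
#include <optional>
#include <sstream>  // std::istringstream, only needed for the in-memory usage noted below

struct Header
{
    char signature[8];
    std::uint32_t version;
    std::uint32_t numberOfSomeChunks;
    std::uint32_t numberOfSomeOtherChunks;
};

// Hypothetical helper (not from the original post): reads one Header
// straight from the stream into a real Header object.
std::optional<Header> readHeader(std::istream& stream)
{
    Header h;
    // Header is trivially copyable, so filling its bytes through a char*
    // is permitted and yields a usable object.
    if (!stream.read(reinterpret_cast<char*>(&h), sizeof h))
        return std::nullopt;  // short read or stream error
    return h;
}
```

For instance, calling readHeader on a std::istringstream that holds fewer than sizeof(Header) bytes yields std::nullopt rather than a partially filled header.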

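Note that reading a binary file this way means that the integer representation in the file (size and byte order) must match the CPU's representation. A file written this way is therefore not portable to systems with a different CPU architecture.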
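
If portability matters, one common workaround is to assemble each integer from individual bytes instead of copying its in-memory representation. The sketch below assumes the file stores integers least-significant byte first (little-endian); the original post does not say which byte order the format uses, and the helper name loadLE32 is made up for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the original post): builds a 32-bit value
// from four bytes stored least-significant byte first. The result does not
// depend on the host CPU's byte order.
// Assumption: the file format is little-endian.
std::uint32_t loadLE32(const std::byte* p)
{
    return  std::to_integer<std::uint32_t>(p[0])
         | (std::to_integer<std::uint32_t>(p[1]) << 8)
         | (std::to_integer<std::uint32_t>(p[2]) << 16)
         | (std::to_integer<std::uint32_t>(p[3]) << 24);
}
```

With the buffer from the question, the version field could then be read as loadLE32(fileBuf.get() + 8), assuming the 8-byte signature comes first in the file with no padding. This reads only std::byte objects, so it also avoids the aliasing problem.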

Answer 2

What can be done to fix this code?

Wait until http://wg21.link/P0593 or something similar allows implicit creation of objects in arrays of char / unsigned char / std::byte.

Answer 1
3 2019-01-23 23:12:58

Answer 2
0 2019-01-24 13:24:37