
Strict aliasing and references to compile-time C arrays

Given the following code:

#include <cassert>
#include <climits>
#include <cstdint>
#include <iostream>

static_assert(CHAR_BIT == 8, "A byte does not consist of 8 bits");

void func1(const int32_t& i)
{
    const unsigned char* j = reinterpret_cast<const unsigned char*>(&i);
    for(int k = 0; k < 4; ++k)
        std::cout << static_cast<int>(j[k]) << ' ';
    std::cout << '\n';
}

void func2(const int32_t& i)
{
    const unsigned char (&j)[4] = reinterpret_cast<const unsigned char (&)[4]>(i);
    for(int k = 0; k < 4; ++k)
        std::cout << static_cast<int>(j[k]) << ' ';
    std::cout << '\n';
}

int main() {
    func1(-1);
    func2(-1);
}

From the language rules it is clear that func1 is fine, as pointers to unsigned char can alias any other type. My question is: does this extend to C++ references to C arrays of known length? Intuitively I would say yes. Is func2 well-defined, or does it trigger undefined behavior?

I have tried compiling the above code using Clang and GCC with every possible combination of -Wextra -Wall -Wpedantic and UBSAN, and have gotten no warnings and always the same output. That obviously doesn't prove that there's no UB, but I couldn't trigger any of the usual strict-aliasing type optimization bugs.

It's undefined behavior.

On the meaning of reinterpret_cast here, we have [expr.reinterpret.cast]:

11 A glvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast. The result refers to the same object as the source glvalue, but with the specified type. [ Note: That is, for lvalues, a reference cast reinterpret_cast<T&>(x) has the same effect as the conversion *reinterpret_cast<T*>(&x) with the built-in & and * operators (and similarly for reinterpret_cast<T&&>(x)). - end note ] No temporary is created, no copy is made, and constructors or conversion functions are not called.

This tells us that the cast in func2 is valid so long as reinterpret_cast<const unsigned char (*)[4]>(&i) is valid. No surprise there. But the crux of the matter is that you may not get anything meaningful out of that pointer conversion. On that subject, [basic.compound] says:

4 Two objects a and b are pointer-interconvertible if:

  • they are the same object, or
  • one is a standard-layout union object and the other is a non-static data member of that object ([class.union]), or
  • one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, the first base class subobject of that object ([class.mem]), or
  • there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.

If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast. [ Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. - end note ]
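As a contrast with func2, here is a minimal sketch of a case the list above does cover: a standard-layout class object and its first non-static data member. The names Wrapper, first_member_of, wrapper_of and check_interconvertible are hypothetical and do not come from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical type for illustration; it does not appear in the original post.
struct Wrapper
{
    std::int32_t first;
    char second;
};

static_assert(std::is_standard_layout<Wrapper>::value,
              "the first-member rule applies only to standard-layout classes");

// Wrapper object -> its first non-static data member.
std::int32_t& first_member_of(Wrapper& w)
{
    return *reinterpret_cast<std::int32_t*>(&w);
}

// First non-static data member -> the enclosing Wrapper object.
// Only valid if the argument really is the `first` member of a live Wrapper.
Wrapper& wrapper_of(std::int32_t& first)
{
    return *reinterpret_cast<Wrapper*>(&first);
}

// Small self-check; call it from main() to exercise both directions.
void check_interconvertible()
{
    Wrapper w{42, 'x'};
    assert(&first_member_of(w) == &w.first);
    assert(first_member_of(w) == 42);
    assert(&wrapper_of(w.first) == &w);
    assert(wrapper_of(w.first).second == 'x');
}
```

An int32_t and an unsigned char[4] have no such relationship, which is why the same cast pattern gives no usable array pointer in func2.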

That's an exhaustive list of meaningful pointer conversions. An int32_t object and an unsigned char[4] array fit none of these cases, so we are not permitted to obtain an array address like that, and the result of the cast is not a valid array glvalue. Therefore any further use you make of the result of the cast is undefined.
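If an actual array of bytes is wanted, one possible well-defined alternative (not part of the original answer) is to copy the object representation into a real array with std::memcpy. This is only a sketch; the helper names to_bytes and check_to_bytes are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original post): copies the object
// representation of an int32_t into a genuine array of unsigned char.
std::array<unsigned char, sizeof(std::int32_t)> to_bytes(const std::int32_t& i)
{
    std::array<unsigned char, sizeof(std::int32_t)> bytes{};
    std::memcpy(bytes.data(), &i, sizeof i);
    return bytes;
}

// Small self-check; call it from main() to exercise the helper.
void check_to_bytes()
{
    // int32_t is two's complement with no padding bits, so -1 is all ones
    // in every byte regardless of endianness.
    for (unsigned char b : to_bytes(-1))
        assert(b == 0xFF);

    // Zero has an all-zero object representation.
    for (unsigned char b : to_bytes(0))
        assert(b == 0);
}
```

The copy costs four bytes and compilers typically optimize it away. When no array type is needed, the unsigned char pointer used in func1 remains the simpler choice.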


 