Standard-layout and tail padding
David Hollman recently tweeted the following example (which I've slightly reduced):
struct FooBeforeBase {
    double d;
    bool b[4];
};
struct FooBefore : FooBeforeBase {
    float value;
};
static_assert(sizeof(FooBefore) > 16);
//----------------------------------------------------
struct FooAfterBase {
protected:
    double d;
public:
    bool b[4];
};
struct FooAfter : FooAfterBase {
    float value;
};
static_assert(sizeof(FooAfter) == 16);
You can examine the layout in clang on godbolt and see that the reason the size changed is that in FooBefore, the member value is placed at offset 16 (maintaining a full alignment of 8 from FooBeforeBase), whereas in FooAfter, the member value is placed at offset 12 (effectively using FooAfterBase's tail padding).
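The offsets described above can also be measured directly instead of read off a layout dump. The following is a minimal sketch, not part of the original question: value_offset is a hypothetical helper introduced here, and it uses address arithmetic because offsetof is only conditionally supported for types that are not standard-layout.

```cpp
#include <cassert>
#include <cstddef>

struct FooBeforeBase {
    double d;
    bool b[4];
};
struct FooBefore : FooBeforeBase {
    float value;
};

struct FooAfterBase {
protected:
    double d;
public:
    bool b[4];
};
struct FooAfter : FooAfterBase {
    float value;
};

// Hypothetical helper (not from the question): byte distance between the
// start of a T object and its member `value`.
template <class T>
std::size_t value_offset() {
    T obj{};
    const char* start = reinterpret_cast<const char*>(&obj);
    const char* member = reinterpret_cast<const char*>(&obj.value);
    return static_cast<std::size_t>(member - start);
}
```

Calling value_offset<FooBefore>() and value_offset<FooAfter>() from a small test program should reproduce the 16 and 12 reported in the question when compiled with gcc or clang for an Itanium-ABI target; other ABIs may give different numbers.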
It is clear to me that FooBeforeBase is standard-layout, but FooAfterBase is not (because its non-static data members do not all have the same access control, [class.prop]/3). But what is it about FooBeforeBase being standard-layout that requires its padding bytes to be respected?
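The standard-layout claims can be checked by the compiler itself. This is a small sketch using the standard <type_traits> queries; it only restates the classification given in the question and says nothing about where members end up.

```cpp
#include <cassert>
#include <type_traits>

struct FooBeforeBase {
    double d;
    bool b[4];
};
struct FooBefore : FooBeforeBase {
    float value;
};

struct FooAfterBase {
protected:
    double d;
public:
    bool b[4];
};
struct FooAfter : FooAfterBase {
    float value;
};

// All members share one access level and live in a single class.
static_assert(std::is_standard_layout<FooBeforeBase>::value,
              "FooBeforeBase is standard-layout");
// Mixed access control (protected and public data members).
static_assert(!std::is_standard_layout<FooAfterBase>::value,
              "FooAfterBase is not standard-layout");
// Both the class and its base declare non-static data members.
static_assert(!std::is_standard_layout<FooBefore>::value,
              "FooBefore is not standard-layout");
static_assert(!std::is_standard_layout<FooAfter>::value,
              "FooAfter is not standard-layout");
// Access control does not affect trivial copyability.
static_assert(std::is_trivially_copyable<FooAfterBase>::value,
              "FooAfterBase is still trivially copyable");
```

These assertions are independent of the ABI, since they depend only on the language rules and not on the layout a compiler picks.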
Both gcc and clang reuse FooAfterBase's padding, ending up with sizeof(FooAfter) == 16. But MSVC does not, ending up with 24. Is there a required layout per the standard and, if not, why do gcc and clang do what they do?
There is some confusion, so just to clear up:
- FooBeforeBase is standard-layout
- FooBefore is not (both it and a base class have non-static data members, similar to E in this example)
- FooAfterBase is not (it has non-static data members of differing access)
- FooAfter is not (for both of the above reasons)
The answer to this question doesn't come from the standard but rather from the Itanium ABI (which is why gcc and clang have one behavior but MSVC does something else). That ABI defines a layout, the relevant parts of which for the purposes of this question are:
For purposes internal to the specification, we also specify:
- dsize(O): the data size of an object, which is the size of O without tail padding.
and
We ignore tail padding for PODs because an early version of the standard did not allow us to use it for anything else and because it sometimes permits faster copying of the type.
Where the placement of members other than virtual base classes is defined as:
Start at offset dsize(C), incremented if necessary for alignment to nvalign(D) for base classes or to align(D) for data members. Place D at this offset unless [... not relevant ...].
The term POD has disappeared from the C++ standard, but it means standard-layout and trivially copyable. In this question, FooBeforeBase is a POD. The Itanium ABI ignores tail padding for PODs (that is, it does not reuse it), hence dsize(FooBeforeBase) is 16.
But FooAfterBase is not a POD (it is trivially copyable, but it is not standard-layout). As a result, its tail padding is not ignored, so dsize(FooAfterBase) is just 12, and the float can go right there.
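The arithmetic behind those two results can be written out. The following is a simplified model of the placement rule quoted above, not the ABI's full algorithm: align_up, member_offset and derived_size are hypothetical names introduced here, and the model only handles a single data member appended after one non-virtual base.

```cpp
#include <cassert>
#include <cstddef>

// Round `offset` up to the next multiple of `alignment`.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Simplified placement rule: start at dsize(base) and round up to the
// alignment of the member being placed.
constexpr std::size_t member_offset(std::size_t base_dsize,
                                    std::size_t member_align) {
    return align_up(base_dsize, member_align);
}

// Size of the derived class: end of the member, rounded up to the
// alignment of the whole class.
constexpr std::size_t derived_size(std::size_t base_dsize,
                                   std::size_t member_size,
                                   std::size_t member_align,
                                   std::size_t class_align) {
    return align_up(member_offset(base_dsize, member_align) + member_size,
                    class_align);
}

// FooBefore: dsize(FooBeforeBase) == 16 because the base is a POD.
static_assert(member_offset(16, 4) == 16, "value lands at offset 16");
static_assert(derived_size(16, 4, 4, 8) == 24, "sizeof(FooBefore) is 24");

// FooAfter: dsize(FooAfterBase) == 12 because the tail padding is reusable.
static_assert(member_offset(12, 4) == 12, "value lands at offset 12");
static_assert(derived_size(12, 4, 4, 8) == 16, "sizeof(FooAfter) is 16");
```

The inputs assume a 4-byte float with 4-byte alignment and an 8-byte class alignment inherited from the double, as in the question's x86-64 numbers.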
This has interesting consequences. As pointed out by Quuxplusone in a related answer, implementors also typically assume that tail padding isn't reused, which wreaks havoc on this example:
#include <algorithm>
#include <stdio.h>
struct A { int m_a; };
struct B : A { int m_b1; char m_b2; };
struct C : B { short m_c; };
int main() {
    C c1 { 1, 2, 3, 4 };
    B& b1 = c1;
    B b2 { 5, 6, 7 };
    printf("before operator=: %d\n", int(c1.m_c));  // 4
    b1 = b2;
    printf("after operator=: %d\n", int(c1.m_c));  // 4
    printf("before std::copy: %d\n", int(c1.m_c));  // 4
    std::copy(&b2, &b2 + 1, &b1);
    printf("after std::copy: %d\n", int(c1.m_c));  // 64, or 0, or anything but 4
}
Here, = does the right thing (it does not overwrite B's tail padding), but copy() has a library optimization that reduces to memmove(), which does not care about tail padding because it assumes it does not exist.
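The safe half of that example can be isolated. This is a minimal sketch, assuming the same A, B and C as above; m_c_after_base_assignment is a hypothetical helper added for illustration. It copies into the B subobject with plain assignment, which works member by member and therefore must leave C::m_c alone wherever the compiler placed it.

```cpp
#include <cassert>

struct A { int m_a; };
struct B : A { int m_b1; char m_b2; };
struct C : B { short m_c; };

// Hypothetical helper: overwrite the B part of a C through a B& using
// operator=, then report what is left in the derived member m_c.
int m_c_after_base_assignment() {
    C c1{};
    c1.m_a = 1;
    c1.m_b1 = 2;
    c1.m_b2 = 3;
    c1.m_c = 4;

    B b2{};
    b2.m_a = 5;
    b2.m_b1 = 6;
    b2.m_b2 = 7;

    B& b1 = c1;
    b1 = b2;  // memberwise copy of the B subobject only

    // The B members were replaced...
    assert(c1.m_a == 5 && c1.m_b1 == 6 && c1.m_b2 == 7);
    // ...and the caller can check that m_c survived.
    return c1.m_c;
}
```

When the destination might be a base subobject, preferring assignment over a byte-wise copy of sizeof(B) bytes avoids depending on whether the ABI reused the tail padding.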
FooBefore derived;
FooBeforeBase src, &dst=derived;
....
memcpy(&dst, &src, sizeof(dst));
If the additional data member were placed in the hole, memcpy would have overwritten it.
As is correctly pointed out in the comments, the standard doesn't require that this memcpy invocation should work. However, the Itanium ABI is seemingly designed with this case in mind. Perhaps the ABI rules are specified this way in order to make mixed-language programming a bit more robust, or to preserve some kind of backwards compatibility.
Relevant ABI rules can be found here.
A related answer can be found here (this question might be a duplicate of that one).
Here is a concrete case which demonstrates why the second case cannot reuse the padding:
union {
    FooBeforeBase a;
    FooBefore b;
} bob;  // declare an object named bob so the member accesses below compile
bob.b.value = 3.14;
memset( &bob.a, 0, sizeof(bob.a) );
This cannot clear bob.b.value.
union {
    FooAfterBase a;
    FooAfter b;
} bob2;  // declare an object named bob2 so the member accesses below compile
bob2.b.value = 3.14;
memset( &bob2.a, 0, sizeof(bob2.a) );
This is undefined behavior.
FooBefore is not std-layout either; two classes declare non-static data members (FooBefore and FooBeforeBase). Thus the compiler is allowed to place some data members arbitrarily, and hence differences arise between tool-chains. In a std-layout hierarchy, at most one class (either the most derived class or at most one intermediate class) shall declare non-static data members.
Here's a case similar to the one in nm's answer.
First, let's have a function that clears a FooBeforeBase:
void clearBase(FooBeforeBase *f) {
    memset(f, 0, sizeof(*f));
}
This is fine: clearBase gets a pointer to FooBeforeBase, and it assumes that, since FooBeforeBase is standard-layout, memsetting it is safe.
Now, if you do this:
FooBefore b;
b.value = 42;
clearBase(&b);
You don't expect clearBase to clear b.value, as b.value is not part of FooBeforeBase. But if FooBefore::value were put into the tail padding of FooBeforeBase, it would be cleared as well.
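Putting the two snippets together gives something that can be compiled and checked. This is a sketch under the assumption discussed in this thread, namely that the ABI does not place FooBefore::value inside the tail padding of FooBeforeBase; the standard itself does not promise that memset on a base subobject is safe. value_after_clearing_base is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <cstring>

struct FooBeforeBase {
    double d;
    bool b[4];
};
struct FooBefore : FooBeforeBase {
    float value;
};

// Same function as above: zero every byte of the base, padding included.
void clearBase(FooBeforeBase* f) {
    std::memset(f, 0, sizeof(*f));
}

// Hypothetical helper: clear the base part of a FooBefore and report
// whether the derived member survived.
float value_after_clearing_base() {
    FooBefore b{};
    b.d = 1.5;
    b.b[0] = true;
    b.value = 42.0f;

    clearBase(&b);

    // The base members are zeroed by the memset.
    assert(b.d == 0.0 && b.b[0] == false);
    return b.value;
}
```

With gcc, clang and MSVC the function is expected to return 42, because value starts at or after sizeof(FooBeforeBase) and so lies outside the range that memset touches.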
Is there a required layout per the standard and, if not, why do gcc and clang do what they do?
No, reusing tail padding is not required. It is an optimization that gcc and clang perform.