Copying structs with uninitialized members

Is it valid to copy a struct some of whose members are not initialized?

I suspect it is undefined behavior, but if so, it makes leaving any uninitialized members in a struct (even if those members are never used directly) quite dangerous. So I wonder if there is something in the standard that allows it.

For instance, is this valid?

struct Data {
  int a, b;
};

int main() {
  Data data;
  data.a = 5;
  Data data2 = data;
}

Yes, if the uninitialized member is not of an unsigned narrow character type or std::byte, then copying a struct containing this indeterminate value with the implicitly defined copy constructor is technically undefined behavior, just as it is for copying a variable of the same type with an indeterminate value, because of [dcl.init]/12.

This applies here because the implicitly generated copy constructor is, except for unions, defined to copy each member individually as if by direct-initialization; see [class.copy.ctor]/4.

This is also the subject of the active CWG issue 2264.

I suppose in practice you will not have any problem with that, though.

If you want to be 100% sure, using std::memcpy always has well-defined behavior if the type is trivially copyable, even if members have indeterminate values.
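To illustrate the member-wise copy described above, here is a minimal sketch with the copy constructor written out by hand. The names DataExplicit and make_and_copy are hypothetical and not part of the question; the hand-written constructor only approximates what the implicitly defined one does for this type.

```cpp
#include <cassert>

// Hypothetical type mirroring the question's Data, with the copy
// constructor spelled out for illustration.
struct DataExplicit {
  int a, b;
  DataExplicit() = default;
  // Roughly what the implicitly defined copy constructor does:
  // each member is direct-initialized from the matching member of other,
  // so every member of the source object is read.
  DataExplicit(const DataExplicit& other) : a(other.a), b(other.b) {}
};

// Hypothetical helper: both members are assigned before the copy,
// so no indeterminate value is read.
inline DataExplicit make_and_copy(int a, int b) {
  DataExplicit d;
  d.a = a;
  d.b = b;
  DataExplicit copy = d;
  return copy;
}
```

Seen this way, the read of other.b is the step that becomes a problem when b was never given a value.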
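A minimal sketch of the std::memcpy approach, assuming the Data struct from the question. The helper names copy_bytes and copied_a are made up for this example.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct Data {
  int a, b;
};

// The memcpy route is only appropriate for trivially copyable types.
static_assert(std::is_trivially_copyable<Data>::value,
              "Data must be trivially copyable to be copied with memcpy");

// Hypothetical helper: copies the object representation byte by byte
// instead of going through the copy constructor.
inline void copy_bytes(Data& dst, const Data& src) {
  std::memcpy(&dst, &src, sizeof(Data));
}

// Hypothetical helper mirroring main() from the question.
inline int copied_a() {
  Data data;       // data.b is left uninitialized, as in the question
  data.a = 5;
  Data data2;
  copy_bytes(data2, data);
  return data2.a;  // only the member that was initialized is read
}
```

Note that data2.b is still indeterminate after the copy, so reading it remains off limits.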


These issues aside, you should always initialize your class members properly with a specified value at construction anyway, assuming you don't require the class to have a trivial default constructor. You can do so easily using the default member initializer syntax to, e.g., value-initialize the members:

struct Data {
  int a{}, b{};
};

int main() {
  Data data;
  data.a = 5;
  Data data2 = data;
}
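As a small check of the trade-off mentioned above, the following sketch (the helper name make_data is hypothetical) shows that default member initializers make the default constructor non-trivial while the type stays trivially copyable:

```cpp
#include <cassert>
#include <type_traits>

struct Data {
  int a{}, b{};
};

// Default member initializers make the default constructor non-trivial...
static_assert(!std::is_trivially_default_constructible<Data>::value,
              "default member initializers make the default constructor non-trivial");
// ...but copying is unaffected.
static_assert(std::is_trivially_copyable<Data>::value,
              "Data is still trivially copyable");

// Hypothetical helper mirroring main() above.
inline Data make_data() {
  Data data;          // a and b both start at 0
  data.a = 5;
  Data data2 = data;  // every member now has a determinate value
  return data2;
}
```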

In general, copying uninitialized data is undefined behavior because that data may be in a trapping state. Quoting this page:

If an object representation does not represent any value of the object type, it is known as trap representation. Accessing a trap representation in any way other than reading it through an lvalue expression of character type is undefined behavior.

Signalling NaNs are possible for floating point types, and on some platforms integers may have trap representations.

However, for trivially copyable types it is possible to use memcpy to copy the raw representation of the object. Doing so is safe since the value of the object is not interpreted, and instead the raw byte sequence of the object representation is copied.

In some cases, such as the one described, the C++ Standard allows compilers to process constructs in whatever fashion their customers would find most useful, without requiring that behavior be predictable. In other words, such constructs invoke "Undefined Behavior". That doesn't imply, however, that such constructs are meant to be "forbidden", since the C++ Standard explicitly waives jurisdiction over what well-formed programs are "allowed" to do. While I'm unaware of any published Rationale document for the C++ Standard, the fact that it describes Undefined Behavior much like C89 does would suggest the intended meaning is similar: "Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior".

There are many situations where the most efficient way to process something would involve writing the parts of a structure that downstream code is going to care about, while omitting those that downstream code isn't going to care about. Requiring that programs initialize all members of a structure, including those that nothing is ever going to care about, would needlessly impede efficiency.

Further, there are some situations where it may be most efficient to have uninitialized data behave in non-deterministic fashion. For example, given:

struct q { unsigned char dat[256]; } x,y;

void test(unsigned char *arr, int n)
{
  q temp;
  for (int i=0; i<n; i++)
    temp.dat[arr[i]] = i;
  x=temp;
  y=temp;
}

if downstream code won't care about the values of any elements of x.dat or y.dat whose indices weren't listed in arr, the code might be optimized to:

void test(unsigned char *arr, int n)
{
  q temp;
  for (int i=0; i<n; i++)
  {
    int index = arr[i];
    x.dat[index] = i;
    y.dat[index] = i;
  }
}

This improvement in efficiency wouldn't be possible if programmers were required to explicitly write every element of temp.dat, including those that downstream code wouldn't care about, before copying it.

On the other hand, there are some applications where it's important to avoid the possibility of data leakage. In such applications, it may be useful either to have a version of the code that's instrumented to trap any attempt to copy uninitialized storage, without regard for whether downstream code would look at it, or to have an implementation guarantee that any storage whose contents could be leaked would get zeroed or otherwise overwritten with non-confidential data.

From what I can tell, the C++ Standard makes no attempt to say that any of these behaviors is sufficiently more useful than the others as to justify mandating it. Ironically, this lack of specification may be intended to facilitate optimization, but if programmers can't exploit any kind of weak behavioral guarantees, any optimizations will be negated.

Since all members of Data are of primitive types, data2 will get an exact "bit-by-bit copy" of all the members of data. So the value of data2.b will be exactly the same as the value of data.b. However, the exact value of data.b cannot be predicted, because you have not initialized it explicitly. It will depend on the values of the bytes in the memory region allocated for data.
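Where the implementation offers no such guarantee, the program can do the zeroing itself. The sketch below is one way to do that, assuming the struct q from the earlier example. fill_zeroed is a hypothetical variant of test() that returns its result rather than writing to globals.

```cpp
#include <cassert>

struct q { unsigned char dat[256]; };

// Hypothetical variant of test(): the temporary is value-initialized,
// so elements whose indices never appear in arr hold 0 rather than
// leftover stack contents.
inline q fill_zeroed(const unsigned char* arr, int n) {
  q temp{};  // zeroes all 256 bytes up front
  for (int i = 0; i < n; i++)
    temp.dat[arr[i]] = static_cast<unsigned char>(i);
  return temp;
}
```

This costs exactly the up-front write of all 256 bytes that the optimized version avoids, in exchange for a copy that never exposes stale data.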
