Where exactly does the C++ standard say dereferencing an uninitialized pointer is undefined behavior?

So far I can't find how to deduce that the following:

int* ptr;
*ptr = 0;

is undefined behavior.

First of all, there's 5.3.1/1, which states that * means indirection, converting T* to T. But this doesn't say anything about UB.

Then there's the often-quoted 3.7.3.2/4, saying that using a deallocation function on a non-null pointer renders the pointer invalid and that later usage of the invalid pointer is UB. But in the code above there's nothing about deallocation.

How can UB be deduced in the code above?

Section 4.1 looks like a candidate (emphasis mine):

An lvalue (3.10) of a non-function, non-array type T can be converted to an rvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the lvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior. If T is a non-class type, the type of the rvalue is the cv-unqualified version of T. Otherwise, the type of the rvalue is T.

I'm sure just searching on "uninitial" in the spec can find you more candidates.

I found the answer to this question in an unexpected corner of the C++ draft standard: section 24.2 Iterator requirements, specifically section 24.2.1 In general, paragraphs 5 and 10, which respectively say (emphasis mine):

[...][ Example: After the declaration of an uninitialized pointer x (as with int* x;), x must always be assumed to have a singular value of a pointer. -end example ] [...] Dereferenceable values are always non-singular.

and:

An invalid iterator is an iterator that may be singular. [268]

and footnote 268 says:

This definition applies to pointers, since pointers are iterators. The effect of dereferencing an iterator that has been invalidated is undefined.

That said, there does appear to be some controversy over whether a null pointer is singular or not, and it looks like the term singular value needs to be properly defined in a more general manner.

The intent of singular seems to be summed up well in defect report 278, "What does iterator validity mean?", under the rationale section, which says:

Why do we say "may be singular", instead of "is singular"? That's because a valid iterator is one that is known to be nonsingular. Invalidating an iterator means changing it in such a way that it's no longer known to be nonsingular. An example: inserting an element into the middle of a vector is correctly said to invalidate all iterators pointing into the vector. That doesn't necessarily mean they all become singular.

So invalidation and being uninitialized may each create a value that is singular, but since we cannot prove such values are nonsingular, we must assume they are singular.
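As a small illustration of the footnote's point that pointers are iterators, here is a minimal sketch. The helper names sum_range and demo_sum are made up for this example and do not come from the standard. It checks at compile time that std::iterator_traits treats int* as a random access iterator, then walks an array using pointers that are known to be non-singular because they were obtained from the array itself.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>

// std::iterator_traits has a specialization for T*, so a raw pointer is
// usable wherever the library expects a random access iterator.
static_assert(
    std::is_same<std::iterator_traits<int*>::iterator_category,
                 std::random_access_iterator_tag>::value,
    "int* is expected to be a random access iterator");

// Hypothetical helper: sums [first, last). Both pointers must be
// non-singular, i.e. point into (or one past the end of) the same array.
int sum_range(const int* first, const int* last) {
    // A null check is the most this function can verify; it has no way to
    // detect that a caller passed an uninitialized pointer.
    assert(first != nullptr && last != nullptr);
    int total = 0;
    for (; first != last; ++first) {
        total += *first;  // only dereferenceable positions are read
    }
    return total;
}

// Hypothetical driver: every pointer here is derived from the array.
int demo_sum() {
    const int data[] = {1, 2, 3};
    return sum_range(std::begin(data), std::end(data));
}
```

Note that the one-past-the-end pointer is non-singular yet not dereferenceable, which is why the loop stops before reading through it.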

Update

An alternative, common-sense approach would be to note what draft standard section 5.3.1 Unary operators, paragraph 1, says (emphasis mine):

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.[...]

and if we then go to section 3.10 Lvalues and rvalues, paragraph 1 says (emphasis mine):

An lvalue (so called, historically, because lvalues could appear on the left-hand side of an assignment expression) designates a function or an object. [...]

but ptr will not, except by chance, point to a valid object.

The OP's question is nonsense. There is no requirement that the Standard say certain behaviours are undefined, and indeed I would argue that all such wording be removed from the Standard because it confuses people and makes the Standard more verbose than necessary.

The Standard defines certain behaviour. The question is, does it specify any behaviour in this case? If it does not, the behaviour is undefined whether or not it says so explicitly.

In fact the specification that some things are undefined is left in the Standard primarily as a debugging aid for the Standards writers, the idea being to generate a contradiction if there is a requirement in one place which conflicts with an explicit statement of undefined behaviour in another: that's a way to prove a defect in the Standard. Without the explicit statement of undefined behaviour, the other clause prescribing behaviour would be normative and unchallenged.

Evaluating an uninitialized pointer causes undefined behaviour. Since dereferencing the pointer first requires evaluating it, this implies that dereferencing also causes undefined behaviour.

This was true in both C++11 and C++14, although the wording changed.

In C++14 it is fully covered by [dcl.init]/12:

When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced.

If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:

where the "following cases" are particular operations on unsigned char.
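For contrast, here is a minimal sketch of the well-defined alternative, assuming nothing beyond the two lines in the question. The function names store_zero and store_zero_checked are hypothetical. Once the pointer has been given a determinate value before it is evaluated, no indeterminate value is produced and the rule quoted above does not come into play.

```cpp
#include <cassert>

// The question's code, repaired: ptr is initialized before it is evaluated.
int store_zero() {
    int target = 42;
    int* ptr = &target;  // determinate value: the address of target
    assert(ptr != nullptr);
    *ptr = 0;            // well-defined write through ptr
    return target;
}

// Initializing to nullptr also gives a determinate value, but dereferencing
// a null pointer is still undefined, so the pointer is tested first.
bool store_zero_checked(int* ptr) {
    if (ptr == nullptr) {
        return false;    // nothing written
    }
    *ptr = 0;
    return true;
}
```

The null check works only because a null pointer is a determinate value that can be compared; an uninitialized pointer cannot even be tested this way without evaluating it.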


In C++11, [conv.lval/2] covered this under the lvalue-to-rvalue conversion procedure (i.e. retrieving the pointer value from the storage area denoted by ptr):

A glvalue of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior.

The bolded part ("or if the object is uninitialized") was removed for C++14 and replaced with the extra text in [dcl.init/12].

I'm not going to pretend I know a lot about this, but some compilers would initialize the pointer to NULL, and dereferencing a NULL pointer is UB.

Also, considering that an uninitialized pointer could point to anything (this includes NULL), you could conclude that it's UB when you dereference it.

A note in section 8.3.2 [dcl.ref]

[Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the "object" obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bitfield. ]

Source: ISO/IEC 14882:1998(E), the ISO C++ standard, in section 8.3.2 [dcl.ref]

I think I should have written this as a comment instead; I'm not really that sure.

To dereference the pointer, you need to read from the pointer variable (not talking about the object it points to). Reading from an uninitialized variable is undefined behaviour.

What you do with the value of the pointer after you have read it doesn't matter anymore at this point, be it writing to (like in your example) or reading from the object it points to.
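To make the distinction concrete, here is a small sketch (the function name assign_then_read is made up for illustration). Assigning to an uninitialized pointer variable only writes to it, so nothing indeterminate is read. The problematic line, left commented out, is one that reads the pointer variable itself, without ever touching the pointed-to object.

```cpp
#include <cassert>

int assign_then_read() {
    int value = 7;
    int* p;          // p holds an indeterminate value, but is not read yet
    // int* q = p;   // would read p itself: undefined behaviour already,
    //               // even though nothing is dereferenced
    p = &value;      // writes p; the old indeterminate value is never read
    assert(p == &value);
    return *p;       // reads p (now determinate), then reads value
}
```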

Even if the normal storage of something in memory would have no "room" for any trap bits or trap representations, implementations are not required to store automatic variables the same way as static-duration variables except when there is a possibility that user code might hold a pointer to them somewhere. This behavior is most visible with integer types. On a typical 32-bit system, given the code:

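#include <stdint.h>

uint16_t foo(void);
uint16_t bar(void);
uint16_t blah(uint32_t q)
{
  uint16_t a;            /* deliberately left uninitialized */
  if (q & 1) a=foo();
  if (q & 2) a=bar();
  return a;              /* indeterminate if bits 0 and 1 of q are both clear */
}
unsigned short test(void)
{
  return blah(65540);    /* 65540 == 0x10004: bits 0 and 1 are clear */
}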

it would not be particularly surprising for test to yield 65540 even though that value is outside the representable range of uint16_t, a type which has no trap representations. If a local variable of type uint16_t holds Indeterminate Value, there is no requirement that reading it yield a value within the range of uint16_t. Since unexpected behaviors could result when using even unsigned integers in such fashion, there's no reason to expect that pointers couldn't behave in even worse fashion.
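For comparison, a minimal sketch of the same function with the local variable given a determinate value on every path. The stub bodies for foo and bar and the names blah_fixed and test_fixed are assumptions made only so the example is self-contained; the original merely declares foo and bar.

```cpp
#include <cassert>
#include <cstdint>

// Stub bodies assumed purely for illustration.
std::uint16_t foo() { return 1; }
std::uint16_t bar() { return 2; }

std::uint16_t blah_fixed(std::uint32_t q) {
    std::uint16_t a = 0;   // determinate on every path
    if (q & 1) a = foo();
    if (q & 2) a = bar();
    return a;
}

unsigned short test_fixed() {
    // 65540 == 0x10004, so neither branch runs and the initial 0 comes back.
    const std::uint16_t result = blah_fixed(65540);
    assert(result == 0);
    return result;
}
```

With the initializer in place, the compiler no longer has an indeterminate value to reason about, so the result has to stay within the range of uint16_t.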

