GCC optimizations based on integer overflow

Recently I had a discussion with someone who wanted to check for signed int overflow like this: if (A + B < 2 * max(A, B)). Let's ignore for a second that the logic itself is wrong and discuss signed integer overflow in the context of C/C++ (C++, I believe, fully inherits this part of the standard from C).

Which kinds of checks that rely on signed integer overflow will be optimized away by current-ish GCC, and which won't?

Since the original text wasn't all that well formulated and was apparently controversial, I decided to change the question somewhat, but leave the original text below.

All examples used below were tested with gcc version 4.7.2 (Debian 4.7.2-5) and compiled using -O3.

Namely, signed integer overflow is undefined behaviour, and GCC infamously uses this to perform some branch simplifications. The first example of this that comes to mind is

int i = 1;
while (i > 0){
    i *= 2;
}

which produces an infinite loop. Another case where this kind of optimization kicks in is

if (A + 2 < A){
    /* Handle potential overflow */
}

where, assuming A is a signed integral type, the overflow branch gets completely removed.
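One way to keep such a check from being removed is to test the operands before adding, so the overflowing addition is never evaluated. The following is only a minimal sketch; the names add_would_overflow and checked_add are made up for illustration and do not come from GCC or the standard library.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b is not representable as an int.
   Only subtractions that cannot overflow are performed. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b stays in range for b > 0 */
    if (b < 0)
        return a < INT_MIN - b;   /* INT_MIN - b stays in range for b < 0 */
    return false;                 /* adding zero never overflows */
}

/* Hypothetical wrapper: adds only when the precondition holds. */
int checked_add(int a, int b)
{
    assert(!add_would_overflow(a, b));
    return a + b;
}
```

With this, the check from the snippet above becomes if (add_would_overflow(A, 2)), which stays well defined for every value of A. If wrapping semantics are really what is wanted, GCC's -fwrapv option is another route, at the cost of portability.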

Even more interestingly, some cases of easily provable integer overflow are left untouched, such as

if (INT_MAX + 1 < 0){
    /* You wouldn't write this explicitly, but after static analysis the program
       could be shown to contain something like this. */
}

which triggers the branch that you would expect with two's complement representation. Similarly, this code leaves the conditional branch intact

int C = abs(A);
if (A + C < 0){
    /* For this to be hit, overflow or underflow had to happen. */
}

Now for the question: is there a pattern that looks roughly like if (A + B < C) or if (A + B < c) that will be optimized away? When I was googling around before writing this, it seemed like the last snippet should be optimized away, but I cannot reproduce this kind of error in an overflow check that doesn't operate on an explicit constant.
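Whether or not a given GCC version removes the abs() check, the same condition can be stated without ever evaluating an overflowing expression. This is a sketch under the assumption that the intent is "would A + abs(A) overflow"; the function names are hypothetical. For non-negative A the sum is 2 * A, for negative A it is 0, and abs(INT_MIN) is itself undefined on two's complement machines.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* True if evaluating a + abs(a) would involve undefined behaviour:
   either abs(INT_MIN), or 2 * a exceeding INT_MAX. */
bool abs_sum_overflows(int a)
{
    return a == INT_MIN || a > INT_MAX / 2;
}

/* Hypothetical helper: computes a + abs(a) only when that is safe. */
int abs_sum(int a)
{
    assert(!abs_sum_overflows(a));
    return a + abs(a);
}
```

Because the predicate uses only comparisons and a division of constants, there is no undefined behaviour for the optimizer to exploit.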

Many compilers will replace expressions involving signed integers or pointers with "false", like

a + 1 < a // signed integer a
p + 1 < p // Pointer p

when the expression can only be true in the case of undefined behaviour. On the other hand, that allows

for (char* q = p; q < p + 2; ++q) ...

to be unrolled, substituting q = p and q = p + 1, without any check, so that's a good thing.
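To make that concrete, here is a rough sketch of the loop and of the straight-line form a compiler is allowed to produce from it. The function names and the summing loop body are invented for illustration (the original leaves the body as "..."), and the code assumes p points to at least two valid chars.

```c
#include <assert.h>
#include <stddef.h>

/* Loop form: the compiler may assume p + 2 does not wrap around,
   so it knows the body runs exactly twice. */
int sum_two_loop(const char *p)
{
    assert(p != NULL);
    int sum = 0;
    for (const char *q = p; q < p + 2; ++q)
        sum += *q;
    return sum;
}

/* What the loop can be reduced to: q = p and q = p + 1, no comparison. */
int sum_two_unrolled(const char *p)
{
    assert(p != NULL);
    return p[0] + p[1];
}
```

If pointer overflow were defined to wrap, the compiler would have to keep the q < p + 2 test in case p + 2 wrapped below p.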

if (A + abs (A) < 0)

is probably too complicated for many compilers. Note that for unsigned integers overflow is not undefined behaviour; the arithmetic wraps around. As a consequence, loops using unsigned 32-bit integers with 64-bit pointers tend to be slower than necessary because the wraparound behaviour must be taken into account. For unsigned 32-bit integers and 64-bit pointers, it is possible that

&p [i] > &p [i+1]

without undefined behaviour (not with 64-bit integers or 32-bit pointers).
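The index arithmetic behind that case can be sketched without allocating a 4 GiB array. The helper names below are hypothetical; they only show how i + 1 behaves when it is computed in 32 bits versus when i is widened first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Index used by p[i + 1] when i is a 32-bit unsigned value:
   the result is reduced modulo 2^32, so UINT32_MAX + 1 becomes 0. */
uint32_t next_index_wrapping(uint32_t i)
{
    return (uint32_t)(i + 1u);
}

/* Index if i is widened to 64 bits before adding: no wraparound. */
uint64_t next_index_widened(uint32_t i)
{
    return (uint64_t)i + 1u;
}

/* True exactly when &p[i + 1] would lie below &p[i]. */
bool next_element_is_lower(uint32_t i)
{
    return next_index_wrapping(i) < i;
}
```

Since the wrapping case is well defined, the compiler has to preserve it, which is why it cannot simply replace the 32-bit index with a 64-bit induction variable.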

If I may paraphrase your question, I believe that you are asking something like this.

Does there exist a compiler that optimises signed integer expressions so aggressively that it is prepared to undertake detailed analysis of certain categories of such expressions in order to determine that a dependent condition is true (or false) throughout the range of representable values for the type of the result of the expression, and by those means delete the conditional test?

The compiler you offer is a particular version of GCC, and the expressions you offer fall into a narrow range, but I assume that you would also be interested to learn of another compiler or closely related expressions.

The answer is that right now I'm not aware of one, but it could be only a matter of time.

Existing compilers evaluate expressions that contain constants or certain recognisable patterns ahead of time, and if during this evaluation they encounter undefined behaviour they will ordinarily avoid optimising the expression. They are not obliged to do so.

Data flow analysis is CPU and memory intensive and tends to be used where there are large benefits to be had. Eventually the C++ standard will stop changing (so much) and the compiler writers will have time on their hands. We're still a bit short of the day when a compiler reads a prime number sieve program and optimises it into a single print statement, but it will come.

The main point of my answer is that this is actually a question about compiler technology and has very little to do with the C++ standard. Perhaps you should ask the GCC group directly.

