
Is this floating-point optimization allowed?

I tried to check where float loses the ability to exactly represent large integers, so I wrote this little snippet:

int main() {
    for (int i=0; ; i++) {
        if ((float)i!=i) {
            return i;
        }
    }
}

This code seems to work with all compilers except clang. Clang generates a simple infinite loop (Godbolt).

Is this allowed? If yes, is it a QoI (quality of implementation) issue?

Note that the built-in operator != requires its operands to be of the same type, and will achieve that using promotions and conversions if necessary. In other words, your condition is equivalent to:

(float)i != (float)i

That condition can never be true, so the loop never exits and the code will eventually overflow i, giving your program Undefined Behaviour. Any behaviour is therefore possible.
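The conversion rule can be checked directly. This is only a minimal sketch, assuming 32-bit int and IEEE 754 binary32 float evaluated without excess precision; the helper names original_condition and check_original_condition are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// The usual arithmetic conversions turn the int operand into a float,
// so a mixed float/int expression has type float.
static_assert(std::is_same<decltype(1.0f + 1), float>::value,
              "the int operand is converted to float");

// Hypothetical helper: the condition from the question, written as-is.
bool original_condition(int i) {
    return (float)i != i;  // really compares (float)i with (float)i
}

void check_original_condition() {
    // 16777217 (2^24 + 1) is not exactly representable as a binary32 float,
    // yet the condition is still false: both sides round to the same value.
    assert(!original_condition(16777216));
    assert(!original_condition(16777217));
}
```

Calling check_original_condition() from main should pass silently under the stated assumptions.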

To correctly check what you want to check, you should cast the result back to int:

if ((int)(float)i != i)
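Put back into the original loop, the corrected condition might look like the sketch below. It assumes 32-bit int and IEEE 754 binary32 float, so the loop stops long before i could overflow; the function names are invented for this example.

```cpp
#include <cassert>

// Returns the first non-negative int that does not survive an
// int -> float -> int round trip.
int first_inexact_int() {
    for (int i = 0; ; i++) {
        if ((int)(float)i != i) {
            return i;
        }
    }
}

void check_first_inexact_int() {
    // With a 24-bit significand, 2^24 + 1 is the first integer that gets rounded.
    assert(first_inexact_int() == 16777217);
}
```

The round trip stays in range for every value the loop visits, so the float-to-int cast itself is well-defined here.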

As @Angew pointed out, the != operator needs the same type on both sides. (float)i != i results in conversion of the RHS to float as well, so we have (float)i != (float)i.


g++ also generates an infinite loop, but it doesn't optimize away the work inside it. You can see it converts int->float with cvtsi2ss and does ucomiss xmm0,xmm0 to compare (float)i with itself. (That was your first clue that your C++ source doesn't mean what you thought it did, as @Angew's answer explains.)

x != x is only true when the comparison is "unordered" because x is NaN. (INFINITY compares equal to itself in IEEE math, but NaN doesn't: NAN == NAN is false, NAN != NAN is true.)
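A quick way to convince yourself of those comparison rules is the following sketch. It assumes IEEE 754 semantics and a build without -ffast-math (which lets the compiler assume NaN never occurs); differs_from_itself is a hypothetical helper.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true only when x is NaN on an IEEE 754 implementation.
bool differs_from_itself(float x) {
    return x != x;
}

void check_self_comparison() {
    const float inf = std::numeric_limits<float>::infinity();
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(!differs_from_itself(1.5f));  // ordinary values equal themselves
    assert(!differs_from_itself(inf));   // infinity equals itself
    assert(differs_from_itself(nan));    // NaN is unordered, even with itself
    assert(!(nan == nan));
}
```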

gcc7.4 and older correctly optimize your code to jnp as the loop branch (https://godbolt.org/z/fyOhW1): keep looping as long as the operands to x != x weren't NaN. (gcc8 and later also check je to break out of the loop, failing to optimize based on the fact that it will always be true for any non-NaN input.) x86 FP compares set PF on unordered.


And BTW, that means clang's optimization is also safe: it just has to CSE (float)i != (implicit conversion to float)i as being the same, and prove that i -> float is never NaN for the possible range of int.

(Although given that this loop will hit signed-overflow UB, it's allowed to emit literally any asm it wants, including a ud2 illegal instruction, or an empty infinite loop regardless of what the loop body actually was.) But ignoring the signed-overflow UB, this optimization is still 100% legal.


GCC fails to optimize away the loop body even with -fwrapv to make signed-integer overflow well-defined (as 2's complement wraparound). https://godbolt.org/z/t9A8t_

Even enabling -fno-trapping-math doesn't help. (GCC's default is unfortunately to enable -ftrapping-math even though GCC's implementation of it is broken/buggy.) int->float conversion can cause an FP inexact exception (for numbers too large to be represented exactly), so with exceptions possibly unmasked it's reasonable not to optimize away the loop body. (Because converting 16777217 to float could have an observable side-effect if the inexact exception is unmasked.)
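The inexact exception can be observed through its sticky flag, roughly as sketched below. This assumes a target with hardware FP status flags (such as x86-64 or AArch64) and uses volatile objects to keep the compiler from folding or moving the conversion, since GCC and clang do not fully honour #pragma STDC FENV_ACCESS. The function name conversion_raises_inexact is made up for this example.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: reports whether converting `value` to float
// sets the sticky FE_INEXACT flag.
bool conversion_raises_inexact(int value) {
    volatile int input = value;        // force a real run-time conversion
    std::feclearexcept(FE_INEXACT);
    volatile float converted = static_cast<float>(input);
    (void)converted;
    return std::fetestexcept(FE_INEXACT) != 0;
}

void check_inexact_flag() {
    assert(!conversion_raises_inexact(16777216));  // 2^24 is exact
    assert(conversion_raises_inexact(16777217));   // 2^24 + 1 must be rounded
}
```

This only reads the masked (sticky) flag; actually trapping on it would need a platform-specific call to unmask the exception.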

But with -O3 -fwrapv -fno-trapping-math, it's a 100% missed optimization not to compile this to an empty infinite loop. Without #pragma STDC FENV_ACCESS ON, the state of the sticky flags that record masked FP exceptions is not an observable side-effect of the code. No int -> float conversion can result in NaN, so x != x can't be true.


These compilers are all optimizing for C++ implementations that use IEEE 754 single-precision (binary32) float and 32-bit int.

The bugfixed (int)(float)i != i loop would have UB on C++ implementations with narrow 16-bit int and/or wider float, because you'd hit signed-integer overflow UB before reaching the first integer that wasn't exactly representable as a float.

But UB under a different set of implementation-defined choices doesn't have any negative consequences when compiling for an implementation like gcc or clang with the x86-64 System V ABI.
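If you wanted the bugfixed loop to refuse to compile on an implementation where it would overflow first, a compile-time guard along these lines could work. It is a sketch that assumes a radix-2 float; the constant name fixed_loop_is_safe is invented here.

```cpp
#include <cfloat>
#include <limits>

// The loop must be able to reach 2^FLT_MANT_DIG + 1 without overflowing int,
// which requires int to have more value bits than float has significand bits.
constexpr bool fixed_loop_is_safe =
    FLT_RADIX == 2 && std::numeric_limits<int>::digits > FLT_MANT_DIG;

static_assert(fixed_loop_is_safe,
              "int is too narrow (or float too wide) for the round-trip loop");
```

With 32-bit int (31 value bits) and binary32 float (24 significand bits) the assertion holds; with 16-bit int it would fail at compile time instead of invoking UB at run time.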


BTW, you could statically calculate the result of this loop from FLT_RADIX and FLT_MANT_DIG, defined in <cfloat>. Or at least you can in theory, if float actually fits the model of an IEEE float rather than some other kind of real-number representation like a Posit / unum.
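As a rough sketch of that static calculation: every integer up to FLT_RADIX^FLT_MANT_DIG fits in the significand, so the first one that doesn't is that power plus one. The names radix_power and first_unrepresentable are hypothetical, the macros come from <cfloat>, and the code assumes the result fits in long long and that float follows the radix/significand model.

```cpp
#include <cfloat>

// FLT_RADIX raised to `exponent`, computed at compile time (C++11-style recursion).
constexpr long long radix_power(int exponent) {
    return exponent == 0 ? 1 : FLT_RADIX * radix_power(exponent - 1);
}

// First positive integer that float cannot represent exactly.
constexpr long long first_unrepresentable = radix_power(FLT_MANT_DIG) + 1;

// On IEEE 754 binary32 (radix 2, 24-bit significand) this is 2^24 + 1.
static_assert(FLT_RADIX != 2 || FLT_MANT_DIG != 24 ||
                  first_unrepresentable == 16777217,
              "binary32 should give 16777217");
```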

I'm not sure how much the ISO C++ standard nails down about float behaviour and whether a format that wasn't based on fixed-width exponent and significand fields would be standards compliant.


In comments:

@geza I would be interested to hear the resulting number!

@nada: it's 16777216

Are you claiming you got this loop to print / return 16777216?

Update: since that comment has been deleted, I think not. Probably the OP is just quoting the float before the first integer that can't be exactly represented as a 32-bit float (https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Precision_limits_on_integer_values), i.e. what they were hoping to verify with this buggy code.

The bugfixed version would of course print 16777217, the first integer that's not exactly representable, rather than the value before that.

(All the higher float values are exact integers, but they're multiples of 2, then 4, then 8, etc. for exponent values higher than the significand width. Many higher integer values can be represented, but 1 unit in the last place (of the significand) is greater than 1 so they're not contiguous integers. The largest finite float is just below 2^128, which is too large for even int64_t.)
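The growing spacing can be inspected with std::nextafter. The sketch below assumes IEEE 754 binary32 float; gap_above and check_float_spacing are names made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Distance from x to the next representable float above it (1 ULP at x).
float gap_above(float x) {
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

void check_float_spacing() {
    assert(gap_above(8388608.0f) == 1.0f);   // from 2^23: consecutive integers
    assert(gap_above(16777216.0f) == 2.0f);  // from 2^24: only multiples of 2
    assert(gap_above(33554432.0f) == 4.0f);  // from 2^25: only multiples of 4
    // The largest finite float is below 2^128 (compared here as doubles).
    assert(std::numeric_limits<float>::max() < std::ldexp(1.0, 128));
}
```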

If any compiler did exit the original loop and print that, it would be a compiler bug.
