Is it a gcc -O2 optimization bug (different result from -O1)?
I wrote a very simple program. It behaves normally without -O2:
#include <stdio.h>
#include <stdint.h>
int main()
{
    uint32_t A[4] = { 1, 2, 3, 4 };
    float B[4] = { 0, 0, 0, 0 };
    float C[4] = { 5, 6, 7, 8 };
    int i;

    // convert integer A to float B
    for (i = 0; i < 4; i++)
        B[i] = (float)A[i];

    // memory copy from B to C
    uint32_t *src = (uint32_t*)(B);
    uint32_t *dst = (uint32_t*)(C);
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
    dst[3] = src[3];

#if 0
    // open this to correct the error
    __asm__("":::"memory");
#endif

    // print C, C should be [1.0, 2.0, 3.0, 4.0]
    for (i = 0; i < 4; i++)
        printf("%f\n", C[i]);

    return 0;
}
Compile without -O2:
$ gcc error.c -o error
$ ./error
1.000000
2.000000
3.000000
4.000000
It works as expected. But if I add -O2:
$ gcc -O2 error.c -o error
$ ./error
-6169930235904.000000
0.000000
-6169804406784.000000
0.000000
In addition, if you switch #if 0 to #if 1, it works correctly again. The asm ("":::"memory") should be unnecessary within a single thread.
Is it a bug in the -O2 optimization?
Is there anything I can tell the compiler so that it handles this correctly? I have a function that stores an xmm register to a (void*) pointer, like:
inline void StoreRegister(void *ptr, const __m128& reg)
{
#if DONT_HAVE_SSE
    const uint32_t *src = reinterpret_cast<const uint32_t*>(&reg);
    uint32_t *dst = reinterpret_cast<uint32_t*>(ptr);
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
    dst[3] = src[3];
#else
    // _mm_storeu_si128 takes a __m128i* destination
    _mm_storeu_si128(reinterpret_cast<__m128i*>(ptr), _mm_castps_si128(reg));
#endif
}
The dst here is the C in the code above. Is there any way to make it correct without modifying the function signature?
No, this is not a manifestation of a compiler bug.
Rather, the behaviour of your code is undefined because you dereference the results of casts such as (uint32_t*)(B). Accessing a float object through a uint32_t lvalue is a violation of strict aliasing.
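One way to copy the bytes without breaking the aliasing rule is memcpy, which is allowed to read and write objects of any type. The following is a minimal sketch rather than a definitive fix; the helper names to_float and copy_floats are made up for illustration and do not appear in the question.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert n unsigned integers to floats
   (the first loop in the question). */
static void to_float(float *out, const uint32_t *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (float)in[i];
}

/* Hypothetical helper: copy n floats byte for byte.
   memcpy accesses the objects as raw bytes, so no float is ever
   read or written through a uint32_t lvalue. */
static void copy_floats(float *dst, const float *src, size_t n)
{
    memcpy(dst, src, n * sizeof *dst);
}
```

Calling to_float(B, A, 4) and then copy_floats(C, B, 4) in place of the pointer casts should leave C holding 1.0, 2.0, 3.0, 4.0 at any optimization level. For small fixed sizes like this, optimizing compilers typically turn the memcpy into plain register moves.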
Compilers, particularly gcc, are becoming more and more aggressive in how they treat undefined constructs. The standard allows them to assume that undefined behaviour does not occur, so they can remove any branch that contains it. Here, -O2 enables gcc's strict-aliasing optimizations, so the compiler may assume that stores through a uint32_t* cannot change the float array C and is free to reorder or drop them.
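For the non-SSE fallback in StoreRegister, the same memcpy approach keeps the (void*) destination in the signature. The sketch below is a C analogue under stated assumptions: vec4f is a hypothetical stand-in for __m128, and store_register is an illustrative name, not the asker's function.

```c
#include <string.h>

/* Hypothetical stand-in for __m128: four packed floats (16 bytes). */
typedef struct { float lane[4]; } vec4f;

/* Store the bytes of *reg to ptr. The destination may be unaligned
   and may point to an object of any type, because memcpy copies
   raw bytes instead of dereferencing a cast pointer. */
static inline void store_register(void *ptr, const vec4f *reg)
{
    memcpy(ptr, reg, sizeof *reg);
}
```

If the casts have to stay, gcc also accepts -fno-strict-aliasing, which turns off this class of optimization for the whole translation unit, at some possible cost in generated code quality.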