Strict aliasing violation: why do gcc and clang generate different output?

Question

When a cast violates the strict aliasing rule in C or C++, the compiler may optimize in a way that propagates a wrong constant value or permits an unaligned access, which can cause performance degradation or bus errors.

I wrote a simple example to see how GCC and Clang optimize constants when I violate the strict aliasing rule.

Here is the code and the assembly I got.

#include <stdio.h>
#include <stdlib.h>

int
foo () //different result in C and C++
{
    int x = 1;
    long *fp = (long *)&x;
    *fp = 1234L;

    return x;
}

//int and long are not compatible 
//Wrong constant propagation as a result of strict aliasing violation
long
bar(int *ip, long *lp)
{
    *lp = 20L;
    *ip = 10;

    return *lp;
}

//char is always compatible with others
//constant is not propagated and memory is read
char
car(char *cp, long *lp)
{
    *cp = 'a';
    *lp = 10L;
    return *cp;
}

Compiled with GCC 8.2 using -std=c11 -O3:

foo:
  movl $1234, %eax
  ret
bar:
  movq $20, (%rsi)
  movl $20, %eax
  movl $10, (%rdi)
  ret
car:
  movb $97, (%rdi)
  movq $10, (%rsi)
  movzbl (%rdi), %eax
  ret

Compiled with Clang 7.0 using -std=c11 -O3:

foo: # @foo
  movl $1, %eax
  retq
bar: # @bar
  movq $20, (%rsi)
  movl $10, (%rdi)
  movl $20, %eax
  retq
car: # @car
  movb $97, (%rdi)
  movq $10, (%rsi)
  movb (%rdi), %al
  retq

The bar and car functions generate almost the same instruction sequences on both compilers, and the return values are the same in both cases: bar violates the rule, so the constant is propagated; car does not violate it, so the correct value is read from memory.

However, foo, which also violates the strict aliasing rule, produces different output under GCC and Clang: GCC propagates the value that was actually stored to memory (as a constant, without a memory reference), while Clang propagates a wrong value. Both compilers seem to apply constant propagation, so why do they generate different results? Does this mean that GCC automatically detects the strict aliasing violation in foo and propagates the correct value?

Why do they show different instruction streams and results?

Answer 1

Why can we say that bar doesn't violate the strict aliasing rule?

If the code that calls bar does not violate strict aliasing, bar will not violate strict aliasing either.

Let me give an example.

Suppose we call bar like this:

int x;
long y;
bar(&x, &y);

Strict aliasing requires that two pointers of different types do not refer to the same memory. &x and &y have different types, and they refer to different memory. This does not violate strict aliasing.

On the other hand, let's say we call it like this:

long y;
bar((int *) &y, &y);

Now we've violated strict aliasing. However, the violation is the caller's fault.

Answer 1 (1, accepted, 2019-01-19 22:23:10)
