
Can GCC optimize methods of a class with compile-time constant variables?

Preamble

I'm using avr-g++ to program AVR microcontrollers, so I always need very efficient code.

GCC can usually optimize a function if its arguments are compile-time constants. For example, I have a function pin_write(uint8_t pin, bool val) which determines the AVR registers for the pin (using my own map from an integer pin to a port/pin pair) and writes the corresponding values to these registers. This function isn't small, because of its generality. But if I call it with compile-time constant pin and val, GCC can do all the calculations at compile time and reduce the call to a couple of AVR instructions, e.g.

sbi PORTB,1
sbi DDRB,1
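The question does not show pin_write itself, so the following is only a minimal host-side sketch of what such a function might look like. The arrays fake_port and fake_ddr and the "8 pins per port" mapping are assumptions made for illustration; on a real AVR the registers would come from <avr/io.h>.

```cpp
#include <cstdint>

// Hypothetical stand-ins for AVR I/O registers so the sketch builds on a host.
// On a real AVR these would be PORTB/DDRB and friends from <avr/io.h>.
volatile std::uint8_t fake_port[3] = {0, 0, 0};
volatile std::uint8_t fake_ddr[3]  = {0, 0, 0};

// Assumed mapping: pins 0-7 -> port 0, 8-15 -> port 1, 16-23 -> port 2.
inline void pin_write(std::uint8_t pin, bool val) {
    if (pin >= 24)
        return;  // outside the assumed mapping, do nothing

    const std::uint8_t port = static_cast<std::uint8_t>(pin / 8);
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << (pin % 8));

    // Configure the pin as an output.
    fake_ddr[port] = static_cast<std::uint8_t>(fake_ddr[port] | mask);

    // Drive the pin high or low.
    if (val)
        fake_port[port] = static_cast<std::uint8_t>(fake_port[port] | mask);
    else
        fake_port[port] = static_cast<std::uint8_t>(fake_port[port] & ~mask);
}
```

When pin and val are literals and the call is inlined, the division, shift and branch can all be folded away, which is the effect the question describes.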

Amble

Let's write code like this:

class A {
        int x;
public:
        A(int x_): x(x_) {}
        void foo() { pin_write(x, 1); }
};

A a(8);
int main()  {
        a.foo();
}

We have only one object of class A, and it's initialized with a constant (8). So it's possible to do all the calculations at compile time:

foo() -> pin_write(x,1) -> pin_write(8,1) -> a couple of asm instructions

But GCC doesn't do so.

Surprisingly, if I remove the global A a(8) and write just

 A(8).foo()

I get exactly what I want:

00000022 <main>:
  22:   c0 9a           sbi     0x18, 0 ; 24
  24:   b8 9a           sbi     0x17, 0 ; 23

Question

So, is there a way to force GCC to do all possible calculations at compile time for single global objects with constant initializers?

Because of this problem, I have to manually expand such cases and replace my original code with this:

const int x = 8;
class A {
public:
    A() {}
    void foo() { pin_write(x, 1); }
};

UPD. This is curious: A(8).foo() inside main is optimized to 2 asm instructions. A a(8); a.foo() inside main is too! But if I declare A a(8) as a global, the compiler produces big general code. I tried adding static; it didn't help. Why?

"But if I declare A a(8) as a global, the compiler produces big general code. I tried adding static; it didn't help. Why?"

In my experience, gcc is very reluctant to do this if the object / function has external linkage. Since we don't have your code to compile, I made a slightly modified version of it:

#include <cstdio>

class A {
        int x;
public:
        A(int x_): x(x_) {}
        int f() { return x*x; }
};

A a(8);

int main()  {
        printf("%d", a.f());
}

I have found 2 ways to make the generated assembly correspond to this:

int main()  {
        printf("%d", 64);
}

In other words: everything is eliminated at compile time so that only the necessary minimum remains.

One way to achieve this with both clang and gcc is:

#include <cstdio>

class A {
        int x;
public:
        constexpr A(int x_): x(x_) {}
        constexpr int f() const { return x*x; }
};

constexpr A a(8);

int main()  {
        printf("%d", a.f());
}

gcc 4.7.2 already eliminates everything at -O1; clang 3.5 trunk needs -O2.
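If you want the compiler to confirm that the call is a genuine constant expression, rather than inspecting the assembly, a static_assert is one option. This is only a sketch built on the class above; square_of_a and table are hypothetical names added for illustration.

```cpp
class A {
        int x;
public:
        constexpr A(int x_): x(x_) {}
        constexpr int f() const { return x*x; }
};

constexpr A a(8);

// Fails to compile unless a.f() can be evaluated at compile time.
static_assert(a.f() == 64, "a.f() must be a compile-time constant");

// Hypothetical uses: the result is accepted where a constant is required.
constexpr int square_of_a = a.f();
int table[square_of_a];  // an array bound must be a constant expression
```

This checks constant evaluation at the language level; whether a plain run-time call such as printf("%d", a.f()) is folded still depends on the optimizer and its settings.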

Another way to achieve this is:

#include <cstdio>

class A {
        int x;
public:
        A(int x_): x(x_) {}
        int f() const { return x*x; }
};

static const A a(8);

int main()  {
        printf("%d", a.f());
}

It only works with clang at -O3. Apparently the constant folding in gcc is not that aggressive. (As clang shows, it can be done, but gcc 4.7.2 did not implement it.)

You can force the compiler to fully optimize the function with all known constants by changing the pin_write function into a template. I don't know whether this particular behavior is guaranteed by the standard, though.

template< int a, int b >
void pin_write() { some_instructions; }

This will probably require changing every line where pin_write is used.
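A slightly fuller sketch of the template approach, applied to the class from the question, might look like the following. The register stand-ins (fake_port, fake_ddr), the pin mapping, and the choice of uint8_t/bool template parameters are assumptions for illustration, not the asker's real implementation.

```cpp
#include <cstdint>

// Hypothetical stand-ins for AVR I/O registers so the sketch builds on a host.
volatile std::uint8_t fake_port[3] = {0, 0, 0};
volatile std::uint8_t fake_ddr[3]  = {0, 0, 0};

// Pin and value are template parameters, so they are constants by construction.
template <std::uint8_t Pin, bool Val>
inline void pin_write() {
    static_assert(Pin < 24, "pin out of range for the assumed 3-port mapping");
    constexpr std::uint8_t port = Pin / 8;
    constexpr std::uint8_t mask = static_cast<std::uint8_t>(1u << (Pin % 8));

    // Configure the pin as an output, then drive it high or low.
    fake_ddr[port] = static_cast<std::uint8_t>(fake_ddr[port] | mask);
    if (Val)
        fake_port[port] = static_cast<std::uint8_t>(fake_port[port] | mask);
    else
        fake_port[port] = static_cast<std::uint8_t>(fake_port[port] & ~mask);
}

// The class from the question, with the pin moved from a data member
// into a template parameter.
template <std::uint8_t Pin>
class A {
public:
    void foo() { pin_write<Pin, true>(); }
};

A<8> a;  // a global object is fine here: the pin is part of the type
```

The trade-off is that each distinct pin produces a distinct type A<Pin>, so the pin can no longer be chosen at run time.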

Additionally, you can declare the function as inline. The compiler isn't guaranteed to inline the function (the inline keyword is just a hint), but if it does, it has a better chance of optimizing compile-time constants away (assuming the compiler can tell that the value is a compile-time constant, which may not always be the case).

Your a has external linkage, so the compiler can't be sure that there isn't other code somewhere modifying it.

If you were to declare a const, you would make it clear that it shouldn't change and also stop it from having external linkage; both should help the compiler be less pessimistic.

(I'd probably declare x const too. It may not help here, but if nothing else it makes it clear to the compiler and to the next reader of the code that you never change it.)
