"static const" vs "#define" for efficiency in C
I was recently wondering what the difference between #define and static const is in C, and why two methods exist to do the same thing. I found some people who had similar questions here:
Advantage and disadvantages of #define vs. constants?
"static const" vs "#define" vs "enum"
Lots of people talk about best practice and convention, and give practical reasons for using one over the other, such as the need to pass a pointer to a constant, which I can do with a static const but not with a #define. However, I have yet to find anyone comparing the efficiency of the two.
From what I understand about the C preprocessor, if I have a statement like this:
#define CONSTANT 6
I create a constant value that can be used like this
char[CONSTANT]
which the preprocessor will replace with char[6] before the code is actually compiled.
This to me seems like it would be more efficient than using static const int constant = 6; because this would create a variable called constant that would live on the stack, which I assume would come with some more baggage than a #define. Assuming I need a constant in a situation where I could choose to use either a preprocessor #define or a static const statement, with no obvious reason to choose one over the other, which is more efficient? And how exactly would I go about testing this myself?
Consider the following 2 test files.
Test1.c: uses static const foo.
// Test1.c uses static const..
#include <stdio.h>
static const int foo = 6;
int main() {
    printf("%d", foo);
    return 0;
}
Test2.c: uses macro.
// Test2.c uses macro..
#include <stdio.h>
#define foo 6
int main() {
    printf("%d", foo);
    return 0;
}
The corresponding assembly when using gcc -O0 (the default) is as follows.
Assembly for Test1.c:
0000000000000000 <main>:
0: 55 push rbp
1: 48 89 e5 mov rbp,rsp
4: 48 83 ec 20 sub rsp,0x20
8: e8 00 00 00 00 call d <main+0xd>
d: b8 06 00 00 00 mov eax,0x6
12: 89 c2 mov edx,eax
14: 48 8d 0d 04 00 00 00 lea rcx,[rip+0x4] # 1f <main+0x1f>
1b: e8 00 00 00 00 call 20 <main+0x20>
20: b8 00 00 00 00 mov eax,0x0
25: 48 83 c4 20 add rsp,0x20
29: 5d pop rbp
2a: c3 ret
2b: 90 nop
Assembly for Test2.c:
0000000000000000 <main>:
0: 55 push rbp
1: 48 89 e5 mov rbp,rsp
4: 48 83 ec 20 sub rsp,0x20
8: e8 00 00 00 00 call d <main+0xd>
d: ba 06 00 00 00 mov edx,0x6
12: 48 8d 0d 00 00 00 00 lea rcx,[rip+0x0] # 19 <main+0x19>
19: e8 00 00 00 00 call 1e <main+0x1e>
1e: b8 00 00 00 00 mov eax,0x0
23: 48 83 c4 20 add rsp,0x20
27: 5d pop rbp
28: c3 ret
29: 90 nop
In both cases, the value is not loaded from memory. The difference is that #define replaces foo with the value itself, so 6 is moved straight into edx, whereas the static const version first moves 6 into eax and then copies eax into edx. That is one extra instruction and one extra register.
From this, we can say that at -O0 the macro is slightly better than the static constant, but the difference is minimal.
EDIT: When using the -O3 compilation option (i.e. with optimization on), both test1.c and test2.c compile to the same code.
0000000000000000 <main>:
0: 48 83 ec 28 sub rsp,0x28
4: e8 00 00 00 00 call 9 <main+0x9>
9: 48 8d 0d 00 00 00 00 lea rcx,[rip+0x0] # 10 <main+0x10>
10: ba 06 00 00 00 mov edx,0x6
15: e8 00 00 00 00 call 1a <main+0x1a>
1a: 31 c0 xor eax,eax
1c: 48 83 c4 28 add rsp,0x28
20: c3 ret
21: 90 nop
So, gcc treats static const and #define the same when it optimizes.
If the constant's definition is visible to the translation, the compiler is certainly capable of utilizing that as an optimization.
this would create a variable called constant that would live on the stack which I assume would come with some more baggage than a #define.
It could "live" in multiple places. A compiler can certainly substitute the constant where referenced, without requiring static or stack storage.
Assuming I need a constant in a situation where I could choose to use either a preprocessor #define or a static const statement with no obvious reasons to choose one over the other, which is more efficient?
It depends on the compiler and architecture. I get the impression that some people believe #define has a big advantage. It doesn't. The obvious case is a complex evaluation or function call (say, sin(4.8)). Consider a constant used inside a loop. A properly scoped constant could be evaluated once. A define could be evaluated on each iteration.
And how exactly would I go about testing this myself?
Read the assembly produced by each compiler you use, and measure.
If you want a rule of thumb, I would say "Use a constant, unless #define provides you a measurable improvement in the scenario".
There was a good writeup in the GCC docs about this. Maybe somebody remembers where exactly it was.
The quick way to test simple optimization questions is to use godbolt.
For your specific issue, a modern optimizing compiler should be able to produce the same code for both cases and will in fact just optimize them away to a constant. We can see this with the following program (see it live):
#include <stdio.h>
#define CONSTANT 6
static const int constant = 6;
void func()
{
    printf( "%d\n", constant ) ;
    printf( "%d\n", CONSTANT ) ;
}
In both cases the access reduces to the following:
movl $6, %esi #,
static const variables are not (at least should not be) created on the stack; space for them is set aside when the program is loaded, so there should not be a runtime penalty associated with their creation.
There may be a runtime penalty associated with their initialization, although the version of gcc I'm using initializes the constant at compile time; I don't know how common that behavior is. If there is such a runtime penalty, it only occurs once at program startup.
Beyond that, any runtime performance difference between a static const-qualified object and a literal (which is what a macro will eventually expand to) should be negligible to non-existent, depending on the type of the literal and the operation involved.
Stupid example (gcc version 4.1.2 20070115 (SUSE Linux)):
#include <stdio.h>
#define FOO_MACRO 5
static const int foo_const = 5;
int main( void )
{
    printf( "sizeof FOO_MACRO = %zu\n", sizeof FOO_MACRO );
    printf( "sizeof foo_const = %zu\n", sizeof foo_const );
    printf( " &foo_const = %p\n", ( void * ) &foo_const );
    printf( "FOO_MACRO = %d\n", FOO_MACRO );
    printf( "foo_const = %d\n", foo_const );
    return 0;
}
Output:
sizeof FOO_MACRO = 4
sizeof foo_const = 4
&foo_const = 0x400660
FOO_MACRO = 5
foo_const = 5
The address of foo_const is in the .rodata section of the binary:
[fbgo448@n9dvap997]~/prototypes/static: objdump -s -j .rodata static
static: file format elf64-x86-64
Contents of section .rodata:
40065c 01000200 05000000 73697a65 6f662046 ........sizeof F
^^^^^^^^
40066c 4f4f5f4d 4143524f 203d2025 7a750a00 OO_MACRO = %zu..
40067c 73697a65 6f662066 6f6f5f63 6f6e7374 sizeof foo_const
40068c 203d2025 7a750a00 20202020 20202666 = %zu.. &f
40069c 6f6f5f63 6f6e7374 203d2025 700a0046 oo_const = %p..F
4006ac 4f4f5f4d 4143524f 203d2025 640a0066 OO_MACRO = %d..f
4006bc 6f6f5f63 6f6e7374 203d2025 640a00 oo_const = %d..
Note that the object is already initialized to 5, so there's no runtime initialization penalty.
In the printf statements, the instruction to load the value of foo_const into %esi requires one more byte than the one to load the literal value 0x5, and it has to read the value from memory through a %rip-relative address:
400538: be 05 00 00 00 mov $0x5,%esi
^^^^^^^^^^^^^^
40053d: bf ab 06 40 00 mov $0x4006ab,%edi
400542: b8 00 00 00 00 mov $0x0,%eax
400547: e8 e4 fe ff ff callq 400430 <printf@plt>
40054c: 8b 35 0e 01 00 00 mov 270(%rip),%esi # 400660 <foo_const>
^^^^^^^^^^^^^^^^^
400552: bf bb 06 40 00 mov $0x4006bb,%edi
400557: b8 00 00 00 00 mov $0x0,%eax
40055c: e8 cf fe ff ff callq 400430 <printf@plt>
Will this translate into a measurable runtime performance difference? Maybe, under the right circumstances. If you're doing something CPU-bound several hundred thousand times in a tight loop, then yes, using a macro (that resolves to a literal) over a static const variable may be measurably faster. If this is something that happens once over the lifetime of the program, then the difference is too small to measure and there's no compelling reason to use the macro over the static const variable.
As always, correctness and maintainability matter more than performance. You're less likely to make a mistake using a static const instead of a macro. Consider the following scenario:
#define FOO 1+2
...
x = FOO * 3;
What answer would you expect, and what answer would you get? Compare that with
static const int foo = 1+2;
...
x = foo * 3;
Yes, you could fix the macro case by using parentheses: (1 + 2). The point is, this scenario isn't an issue if you use the static const object. It's one less way to shoot yourself in the foot.
You've totally changed your question. Here's my answer to your new question:
Because we are talking about C, and assuming you are declaring the array on the stack, the answer is actually very interesting. In this case, it is not possible for there to be any difference between the two. The "6" is not actually used at runtime! Because you are only using it to size an array on the stack, the compiler simply uses this to calculate how much stack space is required for the variable.
Suppose you have a 32 bit address space, and your local function contains this 6-byte array (myArray), and an unsigned 32 bit integer (myInt). The compiler creates the following instructions for entering this function:
- Write the 4 byte return address to the stack
- Move the stack pointer forward by 10 bytes
While executing the function, the runtime doesn't know the names or sizes of any variables. If your code says
myInt = 5;
myArray[myInt] = 25;
then the compiler will have generated these instructions:
- write 00000000 00000000 00000000 00000101 starting at address (StackPointer - 4)
- write 00011001 starting at (StackPointer - 10 + (value at Stackpointer - 4))
So you see, the value "6" is not used at runtime. In fact, you can write to index 6, 7, 8, whatever you want. The run-time won't know that you're overflowing the end of the array. (But depending on how you write the code, the compiler may catch the error at compile time.)
I glossed over some details there (no doubt some that I'm not even aware of) but that's the gist of it. (I welcome your comments)
Defining the 6 as a "const" may actually cause the value to be stored in 4 bytes of otherwise useless space, but that won't affect the execution. Obviously it will get optimized away because it is never used.
But, having said all that, never worry about saving a byte of space. Code maintainability is way more important. The risk of introducing a single tiny bug, or of making your code a tiny bit less readable, is a trillion trillion times more expensive than the cost of an extra few bytes or an extra processor cycle. Use constants and enums to take advantage of all the benefits listed here