Removing constexpr from a variable capturing a constexpr function return value removes compile-time evaluation
Consider the following constexpr function, static_strcmp, which uses C++17's constexpr char_traits::compare function:
#include <string>

constexpr bool static_strcmp(char const *a, char const *b)
{
    return std::char_traits<char>::compare(a, b,
        std::char_traits<char>::length(a)) == 0;
}

int main()
{
    constexpr const char *a = "abcdefghijklmnopqrstuvwxyz";
    constexpr const char *b = "abc";
    constexpr bool result = static_strcmp(a, b);
    return result;
}
godbolt shows this gets evaluated at compile time, and optimised down to:
main:
        xor     eax, eax
        ret
Remove constexpr from bool result:
If we remove the constexpr from constexpr bool result, the call is no longer optimised away.
#include <string>

constexpr bool static_strcmp(char const *a, char const *b)
{
    return std::char_traits<char>::compare(a, b,
        std::char_traits<char>::length(a)) == 0;
}

int main()
{
    constexpr const char *a = "abcdefghijklmnopqrstuvwxyz";
    constexpr const char *b = "abc";
    bool result = static_strcmp(a, b); // <-- note no constexpr
    return result;
}
godbolt shows we now call into memcmp:
.LC0:
        .string "abc"
.LC1:
        .string "abcdefghijklmnopqrstuvwxyz"
main:
        sub     rsp, 8
        mov     edx, 26
        mov     esi, OFFSET FLAT:.LC0
        mov     edi, OFFSET FLAT:.LC1
        call    memcmp
        test    eax, eax
        sete    al
        add     rsp, 8
        movzx   eax, al
        ret
Add a short-circuiting length check:
If we first compare char_traits::length for the two arguments in static_strcmp before calling char_traits::compare, still without constexpr on bool result, the call is optimised away again.
#include <string>

constexpr bool static_strcmp(char const *a, char const *b)
{
    return
        std::char_traits<char>::length(a) == std::char_traits<char>::length(b)
        && std::char_traits<char>::compare(a, b,
            std::char_traits<char>::length(a)) == 0;
}

int main()
{
    constexpr const char *a = "abcdefghijklmnopqrstuvwxyz";
    constexpr const char *b = "abc";
    bool result = static_strcmp(a, b); // <-- note still no constexpr!
    return result;
}
godbolt shows we're back to the call being optimised away:
main:
        xor     eax, eax
        ret
Why does removing constexpr from the initial call to static_strcmp cause the constant evaluation to fail?
Even without constexpr, the call to char_traits::length is evaluated at compile time, so why not the same behaviour without constexpr in the first version of static_strcmp?

We have three cases to consider:
1) the computed value is required to initialize a constexpr variable, or is used where a compile-time-known value is strictly required (non-type template parameter, size of a C-style array, a test in a static_assert(), ...)
2) the constexpr function uses a value that is not known at compile time (for example, values received from standard input)
3) the constexpr function receives values known at compile time, but the result goes somewhere a compile-time value is not required.
If we ignore the as-if rule, we have that:
in case (1) the compiler must compute the value at compile time, because the computed value is required at compile time
in case (2) the compiler must compute the value at run time, because it is impossible to compute it at compile time
in case (3) we are in a grey area where the compiler can compute the value at compile time but the value isn't strictly required at compile time; in this case the compiler can choose whether to compute it at compile time or at run time.
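The three cases can be sketched side by side. This is only an illustration: static_length, runtime_length and grey_area_length are hypothetical names that do not appear in the question, and a hand-written loop stands in for char_traits::length so the sketch does not depend on library constexpr support.

```cpp
#include <cstddef>

// Hypothetical helper (not from the question): counts characters up to the
// terminating '\0'. The loop inside a constexpr function needs C++14 or later.
constexpr std::size_t static_length(char const *s)
{
    std::size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}

// Case (1): the value is needed at compile time, so the compiler must
// evaluate the call during compilation.
constexpr std::size_t len = static_length("abc");
static_assert(len == 3, "evaluated by the compiler");
char buffer[static_length("abc") + 1]; // array bound must be a constant

// Case (2): the argument is only known at run time, so the call can only be
// evaluated at run time.
std::size_t runtime_length(char const *user_input)
{
    return static_length(user_input);
}

// Case (3): the argument is known at compile time, but the result goes into
// an ordinary variable. The compiler may fold the call or emit a real call.
std::size_t grey_area_length()
{
    std::size_t n = static_length("abc");
    return n;
}
```

Whether grey_area_length ends up as a constant in the generated code depends on the optimiser, not on any language rule.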
With the initial code
constexpr bool result = static_strcmp(a, b);
you are in case (1): the compiler must compute the value at compile time because the result variable is declared constexpr.
Removing the constexpr,
bool result = static_strcmp(a, b); // no more constexpr
your code moves into the grey area (case (3)), where compile-time computation is possible but not strictly required, because the input values (a and b) are known at compile time but the result goes where a compile-time value isn't required (an ordinary variable). So the compiler can choose and, in your case, it chooses run-time computation with one version of the function and compile-time computation with the other.
Note that nothing in the standard explicitly requires a constexpr function to be called at compile time; see 9.1.5.7 in the latest draft:
A call to a constexpr function produces the same result as a call to an equivalent non-constexpr function in all respects except that (7.1) a call to a constexpr function can appear in a constant expression and (7.2) copy elision is not performed in a constant expression ([class.copy.elision]).
(emphasis mine)
Now, when the call appears in a constant expression, the compiler has no way to avoid running the function at compile time, so it dutifully obliges. When it does not (as in your second snippet), it is just a case of a missed optimization. There is no shortage of those around.
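If the goal is to keep result an ordinary bool while still guaranteeing compile-time evaluation, one option is to route the call through a context that requires a constant expression. The sketch below uses std::integral_constant for that. It reuses the length-checked static_strcmp from the question, the enclosing function check is a hypothetical name, and a and b are moved to namespace scope purely for illustration.

```cpp
#include <string>
#include <type_traits>

// Length-checked comparison, as in the question's third snippet.
constexpr bool static_strcmp(char const *a, char const *b)
{
    return std::char_traits<char>::length(a) == std::char_traits<char>::length(b)
        && std::char_traits<char>::compare(a, b,
               std::char_traits<char>::length(a)) == 0;
}

constexpr const char *a = "abcdefghijklmnopqrstuvwxyz";
constexpr const char *b = "abc";

// Hypothetical wrapper function, used only for this illustration.
bool check()
{
    // A template argument must be a constant expression, so the call is
    // evaluated by the compiler even though result itself is not constexpr.
    bool result = std::integral_constant<bool, static_strcmp(a, b)>::value;
    return result;
}
```

This only works because a and b are themselves constant expressions; with run-time arguments the template argument would be rejected by the compiler.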
Your program has undefined behavior, because you always compare strlen(a) characters. The string b doesn't have that many characters.
If you modify your strings to be of equal length (so your program becomes well-defined), your program will be optimised as you expect.
So this is not a missed optimization. The compiler would optimize your program, but because it contains undefined behavior, it doesn't.
Note that whether this is undefined behavior is not entirely clear. Considering that the compiler uses memcmp, it assumes that both input strings must be at least strlen(a) long. So, according to the behavior of the compiler, it is undefined behavior.
Here's what the current draft standard says about compare:
Returns: 0 if for each i in [0, n), X::eq(p[i],q[i]) is true; else, a negative value if, for some j in [0, n), X::lt(p[j],q[j]) is true and for each i in [0, j) X::eq(p[i],q[i]) is true; else a positive value.
Now, it is not specified whether compare is allowed to read p[j+1..n) or q[j+1..n) (where j is the index of the first difference).
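One way to sidestep the question is to never pass compare a count larger than either array. The sketch below is a hypothetical variant, named static_streq here, that is not part of the original post: it compares up to and including the terminator of the shorter string, so every element in [0, n) exists in both arrays regardless of how compare is implemented.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant (not from the original post). The count n covers the
// shorter string plus its terminating '\0', so both p[0..n) and q[0..n) are
// valid ranges whichever elements compare() decides to read.
constexpr bool static_streq(char const *a, char const *b)
{
    using traits = std::char_traits<char>;
    const std::size_t len_a = traits::length(a);
    const std::size_t len_b = traits::length(b);
    const std::size_t n = (len_a < len_b ? len_a : len_b) + 1;
    return traits::compare(a, b, n) == 0;
}

static_assert(!static_streq("abcdefghijklmnopqrstuvwxyz", "abc"),
              "a longer string differs at the shorter string's terminator");
static_assert(static_streq("abc", "abc"), "identical strings compare equal");
```

If the lengths differ, the terminator of the shorter string mismatches the corresponding character of the longer one, so no separate length check is needed. Whether a given compiler folds a non-constexpr call to this function is still an optimiser decision.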