
Why is there such a massive difference in compile time between consteval/constexpr and template metafunctions?

I was curious how far I could push GCC's compile-time evaluation, so I made it compute the Ackermann function with inputs 4 and 1 (anything higher than that is impractical):

consteval unsigned int A(unsigned int x, unsigned int y)
{
    if(x == 0)
        return y+1;
    else if(y == 0)
        return A(x-1, 1);
    else
        return A(x-1, A(x, y-1));
}

unsigned int result = A(4, 1);

(I think the recursion depth is bounded at ~16K, but just to be safe I compiled this with -std=c++20 -fconstexpr-depth=100000 -fconstexpr-ops-limit=12800000000.)

Not surprisingly, this takes up an obscene amount of stack space (in fact, the compiler crashes if run with the default process stack size of 8 MB) and takes several minutes to compute. However, it does eventually get there, so evidently the compiler can handle it.

After that I decided to try implementing the Ackermann function using templates, with metafunctions and pattern matching via partial specialization. Amazingly, the following implementation takes only a few seconds to evaluate:

template<unsigned int x, unsigned int y>
struct A {
    static constexpr unsigned int value = A<x-1, A<x, y-1>::value>::value;
};

template<unsigned int y>
struct A<0, y> {
    static constexpr unsigned int value = y+1;
};

template<unsigned int x>
struct A<x, 0> {
    static constexpr unsigned int value = A<x-1, 1>::value;
};

unsigned int result = A<4,1>::value;

(Compile with -ftemplate-depth=17000.)

Why is there such a dramatic difference in evaluation time? Aren't these essentially equivalent? I can understand the consteval solution requiring somewhat more memory and evaluation time, because semantically it consists of a bunch of function calls. But that doesn't explain why this exact same (non-consteval) function, computed at runtime (compiled without optimizations), takes only slightly longer than the metafunction version.

Why is consteval so slow? I'm almost tempted to conclude that it's being evaluated by a GIMPLE interpreter or something like that. Also, how can the metafunction version be so fast? It's actually not much slower than optimized machine code.

In the template version of A, when a particular specialization, say A<2,3>, is instantiated, the compiler remembers this type and never needs to instantiate it again. This follows from the fact that types are unique, and each "call" to this metafunction is just computing a type.
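As a small sketch of that uniqueness, the snippet below restates the question's metafunction under the hypothetical name Ack (renamed only for this illustration) and checks that two different spellings of the same arguments denote one and the same specialization. It shows the language rule, not how GCC stores instantiations internally.

```cpp
#include <type_traits>

// Same metafunction as in the question, renamed to Ack (a hypothetical name
// used only for this illustration).
template <unsigned int x, unsigned int y>
struct Ack {
    static constexpr unsigned int value = Ack<x - 1, Ack<x, y - 1>::value>::value;
};

template <unsigned int y>
struct Ack<0, y> {
    static constexpr unsigned int value = y + 1;
};

template <unsigned int x>
struct Ack<x, 0> {
    static constexpr unsigned int value = Ack<x - 1, 1>::value;
};

// A(1, 0) evaluates to 2, so Ack<1, Ack<1, 0>::value> and Ack<1, 2> are two
// spellings of the same type; the compiler instantiates it only once.
static_assert(Ack<1, 0>::value == 2, "A(1, 0) is 2");
static_assert(std::is_same<Ack<1, Ack<1, 0>::value>, Ack<1, 2>>::value,
              "both spellings name the same specialization");
static_assert(Ack<2, 3>::value == 9, "A(2, 3) is 9");
```

Because every later mention of Ack<1, 2> refers to the already-instantiated type, reading its value member again costs a lookup rather than a fresh evaluation.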

The consteval function version is not optimized to do this, so A(2,3) may be evaluated multiple times, depending on the control flow, resulting in the performance difference you observe. Nothing stops compilers from "caching" the results of function calls, but these optimizations likely just haven't been implemented yet.
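To get a feel for how much work caching saves, here is a rough runtime sketch that counts invocations of the plain recursion against the number of distinct (x, y) pairs a memoized version actually has to compute. The function names and the std::map cache are assumptions made for this illustration; they say nothing about how GCC implements either constant evaluation or template instantiation.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical helper names, introduced only for this illustration.
using AckCache = std::map<std::pair<unsigned int, unsigned int>, unsigned int>;

// Plain recursion, as in the consteval version; `calls` counts every invocation.
unsigned int ackermann_naive(unsigned int x, unsigned int y, std::uint64_t& calls) {
    ++calls;
    if (x == 0) return y + 1;
    if (y == 0) return ackermann_naive(x - 1, 1, calls);
    return ackermann_naive(x - 1, ackermann_naive(x, y - 1, calls), calls);
}

// Memoized recursion, loosely mimicking "each specialization is instantiated once";
// `computed` counts only the (x, y) pairs that were not already in the cache.
unsigned int ackermann_memo(unsigned int x, unsigned int y,
                            AckCache& cache, std::uint64_t& computed) {
    const auto key = std::make_pair(x, y);
    const auto it = cache.find(key);
    if (it != cache.end()) return it->second;  // cache hit: no recomputation

    ++computed;
    unsigned int result;
    if (x == 0) {
        result = y + 1;
    } else if (y == 0) {
        result = ackermann_memo(x - 1, 1, cache, computed);
    } else {
        const unsigned int inner = ackermann_memo(x, y - 1, cache, computed);
        result = ackermann_memo(x - 1, inner, cache, computed);
    }
    cache.emplace(key, result);
    return result;
}
```

Calling both with small inputs such as (2, 3) and comparing the two counters shows the memoized version doing less work. The gap widens quickly as the inputs grow, which is consistent with the difference described above.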

