Why does capturing a stateless lambda sometimes result in increased size?

Question

Given a chain of lambdas where each one captures the previous one by value:

#include <iostream>

int main() {
    auto l1 = [](int a, int b) { std::cout << a << ' ' << b << '\n'; };
    auto l2 = [=](int a, int b) { std::cout << a << '-' << b << '\n'; l1(a, b); };
    auto l3 = [=](int a, int b) { std::cout << a << '#' << b << '\n'; l2(a, b); };
    auto l4 = [=](int a, int b) { std::cout << a << '%' << b << '\n'; l3(a, b); };

    std::cout << sizeof(l4);
}

We can observe that the resulting sizeof of l4 is equal to 1.

That makes sense to me. We are capturing lambdas by value, and each of those objects has to have a sizeof equal to 1, but since they are stateless, an optimization similar to the [[no_unique_address]] one applies (especially since they all have unique types).

However, when I try to create a generic builder for chaining comparators, this optimization no longer takes place:

#include <iostream>

template <typename Comparator>
auto comparing_by(Comparator&& comparator) {
    return comparator;
}

template <typename Comparator, typename... Comparators>
auto comparing_by(Comparator&& comparator, Comparators&&... remaining_comparators) {
    return [=](auto left, auto right) {
        auto const less = comparator(left, right);
        auto const greater = comparator(right, left);
        if (!less && !greater) {
            return comparing_by(remaining_comparators...)(left, right);
        }
        return less;
    };
}

struct triple {
    int x, y, z;
};

auto main() -> int {
    auto by_x = [](triple left, triple right) { return left.x < right.x; };
    auto by_y = [](triple left, triple right) { return left.y < right.y; };
    auto by_z = [](triple left, triple right) { return left.z < right.z; };

    auto comparator = comparing_by(by_x, by_z, by_y);

    std::cout << sizeof(comparator);
}

Note 1: I am aware that comparing_by is inefficient and sometimes calls the comparator redundantly.

Why, in the above case, is the resulting sizeof of comparator equal to 3 and not to 1? It is still stateless, after all. Where am I wrong? Or is it just a missed optimization in all of the big three compilers?

Note 2: This is purely an academic question. I am not trying to solve any particular problem.

Answer 1

What's happening in the first example is not what you think it is. Let's say l1 has type L1, l2 has type L2, etc. These are the members of those types:

struct L1 {
   // empty;
};

sizeof(L1) == 1

struct L2 {
    L1 l1;
};

sizeof(L2) == sizeof(L1)  // 1

struct L3 {
    L2 l2;
};

sizeof(L3) == sizeof(L2)  // 1

struct L4 {
    L3 l3;
};

sizeof(L4) == sizeof(L3)  // 1

In your second example, the outer lambda captures all three comparators by value, so its closure type has three non-overlapping members, and its size will be at least 3.
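The two layouts can be modelled with ordinary structs. This is only a sketch: the names by_x_t, wraps_one, wraps_wrapper and holds_three are hypothetical stand-ins for the compiler-generated closure types, and the exact sizes are what common implementations produce rather than something the standard pins down.

```cpp
// Hypothetical stand-ins for three distinct stateless closure types.
struct by_x_t {};
struct by_y_t {};
struct by_z_t {};

// First example: each closure holds exactly one member, the previous closure.
struct wraps_one {
    by_x_t inner;
};
struct wraps_wrapper {
    wraps_one inner;
};

// Second example: one closure holds all three comparators side by side.
struct holds_three {
    by_x_t first;
    by_z_t second;
    by_y_t third;
};

// Wrapping a single member adds nothing on top of that member's size...
static_assert(sizeof(wraps_wrapper) == sizeof(by_x_t));
// ...but three separate members each need their own storage.
static_assert(sizeof(holds_three) >= 3 * sizeof(by_x_t));
```

No empty-member optimisation is involved in the first case: a struct with one member is simply as large as that member.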

[[no_unique_address]] can't be generically applied to the data members of a closure type (consider an empty class that puts its address in a global map).
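A rough sketch of such a class shows why overlapping would be observable. Everything here (tracker, live_addresses, plain_members, overlapping_members, count_registered) is a hypothetical name made up for illustration. Note that [[no_unique_address]] is a C++20 attribute and that MSVC ignores the standard spelling in favour of [[msvc::no_unique_address]].

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical registry: every live tracker records its own address here.
inline std::set<const void*>& live_addresses() {
    static std::set<const void*> addresses;
    return addresses;
}

// Empty (no data members), yet its identity is observable via the registry.
template <int Tag>
struct tracker {
    tracker() { live_addresses().insert(this); }
    tracker(const tracker&) { live_addresses().insert(this); }
    ~tracker() { live_addresses().erase(this); }
};

// Ordinary members: the two trackers must live at different addresses.
struct plain_members {
    tracker<1> a;
    tracker<2> b;
};

// Opt-in overlap: the two trackers are allowed to share one address.
struct overlapping_members {
    [[no_unique_address]] tracker<1> a;
    [[no_unique_address]] tracker<2> b;
};

// Counts how many distinct addresses one Holder object registers.
template <typename Holder>
std::size_t count_registered() {
    assert(live_addresses().empty());  // no trackers alive before the call
    Holder holder;
    return live_addresses().size();
}
```

count_registered<plain_members>() is 2, whereas count_registered<overlapping_members>() may be 1 on compilers that honour the attribute. A compiler that overlapped captured members on its own would therefore change what a program like this observes.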

The compiler could use empty base optimisation for a "well-behaved type" (a trivially copyable empty type, maybe?), so this might be a missed optimisation. The standard says this about what can be done ([expr.prim.lambda.closure]p2):

The closure type is not an aggregate type. An implementation may define the closure type differently from what is described below provided this does not alter the observable behavior of the program other than by changing:

the size and/or alignment of the closure type,

whether the closure type is trivially copyable ([class.prop]), or

whether the closure type is a standard-layout class ([class.prop]).

So the change in size is OK, but it would have to be done in such a way that is_empty_v<lambda_that_captures_stateless_lambda> is not true (since that's an observable behaviour).
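A small check of that trait, as a sketch: make_stateless, make_wrapper, stateless_t and wrapper_t are hypothetical names, and the first and last assertions reflect what GCC, Clang and MSVC do in practice rather than a guarantee in the standard.

```cpp
#include <type_traits>

// A lambda with no captures.
inline auto make_stateless() {
    return [](int a, int b) { return a < b; };
}

// A lambda that captures the stateless one by value.
inline auto make_wrapper() {
    auto inner = make_stateless();
    return [=](int a, int b) { return inner(a, b); };
}

using stateless_t = decltype(make_stateless());
using wrapper_t = decltype(make_wrapper());

// Observed on the major compilers: a captureless closure is an empty class.
static_assert(std::is_empty_v<stateless_t>);
// The wrapper has one data member, so it is not an empty class...
static_assert(!std::is_empty_v<wrapper_t>);
// ...even though it is no larger than the lambda it captures.
static_assert(sizeof(wrapper_t) == sizeof(stateless_t));
```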

To "manually" apply this optimisation, instead of calling the captured lambda as comparator(left, right), you can default-construct an object of the closure type and call that (decltype(comparator){}(left, right)). I've implemented that here: https://godbolt.org/z/73M1Gd3o5
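One possible shape for this, as a sketch that is not necessarily identical to the linked code: the comparators are taken by value only to deduce their types, nothing is captured, and each comparator is recreated where it is needed. The static_assert, the constexpr specifiers and the by-value parameters are choices made for this sketch. Default-constructing a captureless lambda requires C++20.

```cpp
#include <type_traits>

// Base case: a single comparator is returned unchanged.
template <typename Comparator>
constexpr auto comparing_by(Comparator comparator) {
    return comparator;
}

// Recursive case: nothing is captured; comparators are recreated on demand.
template <typename Comparator, typename... Comparators>
constexpr auto comparing_by(Comparator, Comparators...) {
    // Assumption of this sketch: every comparator is an empty,
    // default-constructible type, e.g. a captureless lambda in C++20.
    static_assert(std::is_empty_v<Comparator> &&
                  std::is_default_constructible_v<Comparator>);
    return [](auto left, auto right) {
        auto const less = Comparator{}(left, right);
        auto const greater = Comparator{}(right, left);
        if (!less && !greater) {
            // Tie on this comparator: defer to the remaining ones.
            return comparing_by(Comparators{}...)(left, right);
        }
        return less;
    };
}
```

With this version, comparing_by(by_x, by_z, by_y) from the question yields a closure with no captures at all, so its size should drop back to 1 on the compilers discussed. The trade-off is that comparators carrying state are rejected at compile time.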

Solution 1
4, accepted, 2022-04-22 09:07:21
