Performance difference of run-time polymorphism between C and C++

Question

I know benchmarking is a very delicate subject, and simple, not-well-thought-out benchmarks are mostly meaningless for performance comparisons, but what I have right now is a pretty small and contrived example that I think should be easily explainable. So, even if the question seems unhelpful, it would at least help me understand benchmarking.

So, here I go.

I was experimenting with simple API design in C, getting run-time-polymorphism-like behaviour via void *. Then I compared it with the same thing implemented in C++ using regular virtual functions. Here is the code:

#include <cstdlib>
#include <cstdio>
#include <cstring>

int dummy_computation()
{
    return 64 / 8;
}

/* animal library, everything is prefixed with al for namespacing */
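/* no trailing semicolons, so the macros can be used inside expressions */
#define AL_SUCCESS 0
#define AL_UNKNOWN_ANIMAL 1
/* fully parenthesized so the comparison stays intact at any call site */
#define AL_IS_TYPE_OF(animal, type) \
    (strcmp(((type *)(animal))->animal_type, #type) == 0)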

typedef struct {
    const char* animal_type;
    const char* name;
    const char* sound;
} al_dog;

inline int make_dog(al_dog** d) {
    *d = (al_dog*) malloc(sizeof(al_dog));
    (*d)->animal_type = "al_dog";
    (*d)->name = "leslie";
    (*d)->sound = "bark";
    return AL_SUCCESS;
}

inline int free_dog(al_dog* d) {
    free(d);
    return AL_SUCCESS;
}
    
typedef struct {
    const char* animal_type;
    const char* name;
    const char* sound;
} al_cat;

inline int make_cat(al_cat** c) {
    *c = (al_cat*) malloc(sizeof(al_cat));
    (*c)->animal_type = "al_cat";
    (*c)->name = "garfield";
    (*c)->sound = "meow";
    return AL_SUCCESS;
}

inline int free_cat(al_cat* c) {
    free(c);
    return AL_SUCCESS;
}

int make_sound(void* animal) {
    if(AL_IS_TYPE_OF(animal, al_cat)) {
        al_cat *c = (al_cat*) animal;
        return dummy_computation();
    } else if(AL_IS_TYPE_OF(animal, al_dog)) {
        al_dog *d = (al_dog*) animal;
        return dummy_computation();
    } else {
        printf("unknown animal\n");
        return 0;
    }
}
/* c style library finishes here */

/* cpp library with OOP */
struct animal {
    animal(const char* n, const char* s) 
    :name(n)
    ,sound(s)
    {} 
    /* needed because the benchmark deletes dog/cat through an animal* */
    virtual ~animal() = default;
    virtual int make_sound() {
        return dummy_computation();
    }
    const char* name;
    const char* sound;
};

struct cat : animal {
    cat() 
    :animal("garfield", "meow")
    {}
};

struct dog : animal {
    dog() 
    :animal("leslie", "bark")
    {}
};
/* cpp library finishes here */

I have a function called dummy_computation, just to make sure some computation is going on in the benchmark. I would normally implement different printf calls for barking, meowing, etc. for such an example, but printf is not easily benchmarkable on quick-bench.com. The actual thing I want to benchmark is the run-time polymorphism implementation. That's why I made a small function and used it in both the C and C++ implementations as a filler.

Now, on quick-bench.com, I have a benchmark like the following:

static void c_style(benchmark::State& state) {
  // Code inside this loop is measured repeatedly
  for (auto _ : state) {
    al_dog* d = NULL;
    al_cat* c = NULL;

    make_dog(&d);
    make_cat(&c);
    
    int i1 = make_sound(d);
    benchmark::DoNotOptimize(i1);
    int i2 = make_sound(c);
    benchmark::DoNotOptimize(i2);

    free_dog(d);
    free_cat(c);
  }
}
// Register the function as a benchmark
BENCHMARK(c_style);

static void cpp_style(benchmark::State& state) {
  for (auto _ : state) {
    animal* a1 = new dog();
    animal* a2 = new cat();
    int i1 = a1->make_sound();
    benchmark::DoNotOptimize(i1);
    int i2 = a2->make_sound();
    benchmark::DoNotOptimize(i2);
    delete a1;
    delete a2;
  }
}
BENCHMARK(cpp_style);

I added DoNotOptimize calls so that the virtual calls would not end up being optimized out.

The whole benchmark can be found here, in case recreating it seems painful:

https://quick-bench.com/q/ezul9hDXTjfSWijCfd2LMUUEH1I

Now, to my surprise, the C version comes out 27 times faster in the results. I expected maybe some performance hit in the C++ version because it is a more refined solution, but definitely not 27-fold.

Can someone explain these results? Do virtual function calls really incur this much overhead compared to C? Or is the way I set up this benchmarking experiment completely meaningless? If so, how would one benchmark such issues more correctly?

Answer 1

It's because you're not implementing the same thing. If you write an if-chain or a switch-chain in C, then you have (mathematically) a discriminated union, which is std::variant in C++.
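To illustrate the closed-set idea, here is a minimal sketch of what a std::variant version might look like. The names variant_dog, variant_cat, any_animal and sound_visitor are made up for this example and do not appear in the question's code; dummy_computation is repeated only so the snippet compiles on its own. It assumes C++17.

```cpp
#include <variant>

// Stand-in copy of the question's filler function.
inline int dummy_computation() { return 64 / 8; }

// Hypothetical closed set of animals: the full list of types is fixed here.
struct variant_dog { const char* name = "leslie";   const char* sound = "bark"; };
struct variant_cat { const char* name = "garfield"; const char* sound = "meow"; };

using any_animal = std::variant<variant_dog, variant_cat>;

// One overload per alternative; the compiler rejects the visitor
// if an alternative of any_animal is left unhandled.
struct sound_visitor {
    int operator()(const variant_dog&) const { return dummy_computation(); }
    int operator()(const variant_cat&) const { return dummy_computation(); }
};

// Dispatches on the variant's stored index, much like the if-chain on
// animal_type in the C code, but without string comparisons.
inline int make_sound(const any_animal& a) {
    return std::visit(sound_visitor{}, a);
}
```

A call would look like make_sound(any_animal{variant_dog{}}). Adding a third animal means editing the any_animal alias and the visitor, which is exactly the "closed" property described above.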

If you'd like the C++ version to be ported to C, then you need function pointers. It'll very likely be equally slow. The reason is that virtual means forward compatible: any code, including a library loaded later, can derive from your base and implement the virtual methods. This means that when you compile your base module, you sometimes don't even know which (descendant) classes it might need to handle (the type system is open). Such forward compatibility is not provided by std::variant, which is closed (limited to a fixed list of types).
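As a rough sketch of the function-pointer approach, the following shows one way the virtual call could be mirrored in C-style code. The names fp_animal, fp_make_dog, fp_make_cat and fp_make_sound are hypothetical and not part of the question's library; dummy_computation is repeated so the snippet stands alone. It is written in the C-compatible subset plus a cast so that it compiles as C++, like the rest of the question's code.

```cpp
#include <cstdlib>

// Stand-in copy of the question's filler function.
inline int dummy_computation() { return 64 / 8; }

// Hypothetical "open" animal: the function pointer plays the role of a
// one-entry vtable, so new animals can be added without touching this struct.
struct fp_animal {
    int (*make_sound)(const fp_animal* self);
    const char* name;
    const char* sound;
};

static int fp_dog_sound(const fp_animal*) { return dummy_computation(); }
static int fp_cat_sound(const fp_animal*) { return dummy_computation(); }

// Shared allocation helper; returns nullptr if malloc fails.
static fp_animal* fp_make(int (*fn)(const fp_animal*),
                          const char* name, const char* sound) {
    fp_animal* a = static_cast<fp_animal*>(std::malloc(sizeof(fp_animal)));
    if (a == nullptr) return nullptr;
    a->make_sound = fn;
    a->name = name;
    a->sound = sound;
    return a;
}

inline fp_animal* fp_make_dog() { return fp_make(fp_dog_sound, "leslie", "bark"); }
inline fp_animal* fp_make_cat() { return fp_make(fp_cat_sound, "garfield", "meow"); }

// Indirect call through the stored pointer: the counterpart of a virtual call.
inline int fp_make_sound(const fp_animal* a) { return a->make_sound(a); }
```

Objects created this way are released with std::free. Benchmarking this variant against the virtual-function version would be the like-for-like comparison, since both dispatch through a pointer that is only known at run time.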
