C++ / compilation: is it possible to set the size of the vptr (global vtable + 2-byte index)?

Question

I recently posted a question about the memory overhead due to virtuality in C++. The answers helped me understand how the vtable and vptr work. My problem is the following: I work on supercomputers, I have billions of objects of some types, and consequently I have to care about the memory overhead due to virtuality. According to my measurements, when I use classes with virtual functions, each derived object carries its own 8-byte vptr. This is not negligible at all.

I wonder whether Intel icpc or g++ have some configuration/option/parameter to use "global" vtables and indexes with adjustable precision instead of a vptr. Such a thing would allow me to use a 2-byte index (unsigned short int) instead of an 8-byte vptr for billions of objects (a good reduction of the memory overhead). Is there any way to do that (or something like it) with compilation options?

Thank you very much.

Answer 1

Unfortunately... not automatically.

But remember that a v-table is nothing but syntactic sugar for runtime polymorphism. If you are willing to re-engineer your code, there are several alternatives.

External polymorphism
Hand-made v-tables
Hand-made polymorphism

1) External polymorphism

The idea is that sometimes you only need polymorphism in a transient fashion. That is, for example:

std::vector<Cat> cats;
std::vector<Dog> dogs;
std::vector<Ostrich> ostriches;

void dosomething(Animal const& a);

It seems wasteful for Cat or Dog to have a virtual pointer embedded in this situation, because you already know the dynamic type (they are stored by value).

External polymorphism is about having pure concrete types and pure interfaces, as well as a simple bridge in the middle that temporarily (or permanently, but that is not what you want here) adapts a concrete type to an interface.

// Interface
class Animal {
public:
    virtual ~Animal() {}

    virtual size_t age() const = 0;
    virtual size_t weight() const = 0;

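    virtual void eat(Food const&) = 0;
    virtual void sleep(Duration) = 0;

protected:
    // Needed explicitly: declaring the deleted copy constructor below
    // suppresses the implicit default constructor.
    Animal() = default;

private:
    Animal(Animal const&) = delete;
    Animal& operator=(Animal const&) = delete;
};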

// Concrete class
class Cat {
public:
    size_t age() const;
    size_t weight() const;

    void eat(Food const&);
    void sleep(Duration);
};

The bridge is written once and for all:

template <typename T>
class AnimalT: public Animal {
public:
    AnimalT(T& r): _ref(r) {}

    virtual size_t age() const override { return _ref.age(); }
    virtual size_t weight() const override { return _ref.weight(); }

    virtual void eat(Food const& f) override { _ref.eat(f); }
    virtual void sleep(Duration const d) override { _ref.sleep(d); }

private:
    T& _ref;
};

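// Note: Animal is non-copyable, so returning the bridge by value
// relies on C++17 guaranteed copy elision.
template <typename T>
AnimalT<T> iface_animal(T& r) { return AnimalT<T>(r); }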

And you can use it like so:

// Non-const iteration: the bridge forwards the non-const eat() and sleep().
for (auto& c: cats) { dosomething(iface_animal(c)); }

It incurs an overhead of two pointers per item (the v-pointer of the bridge plus the reference it holds), but only for as long as you need polymorphism.

An alternative is to have AnimalT<T> work with values too (instead of only references) and to provide a clone method, which lets you choose, depending on the situation, between having a v-pointer or not.

In this case, I advise using a simple class:

template <typename T> struct ref { ref(T& t): _ref(t) {} T& _ref; };

template <typename T>
T& deref(T& r) { return r; }

template <typename T>
T& deref(ref<T> const& r) { return r._ref; }

And then modify the bridge a bit:

template <typename T>
class AnimalT: public Animal {
public:
    AnimalT(T r): _r(r) {}

    std::unique_ptr< AnimalT<T> > clone() const { return std::unique_ptr< AnimalT<T> >(new AnimalT<T>(_r)); }

    virtual size_t age() const override { return deref(_r).age(); }
    virtual size_t weight() const override { return deref(_r).weight(); }

    virtual void eat(Food const& f) override { deref(_r).eat(f); }
    virtual void sleep(Duration const d) override { deref(_r).sleep(d); }

private:
    T _r;
};

template <typename T>
AnimalT<T> iface_animal(T r) { return AnimalT<T>(r); }

template <typename T>
AnimalT<ref<T>> iface_animal_ref(T& r) { return AnimalT<ref<T>>(ref<T>(r)); }

This way you choose when you want polymorphic storage and when you do not.

2) Hand-made v-tables

(only works easily on closed hierarchies)

It is common in C to emulate object orientation by providing one's own v-table mechanism. Since you appear to know what a v-table is and how the v-pointer works, you can perfectly well build it yourself.

struct FooVTable {
    typedef void (Foo::*DoFunc)(int, int);

    DoFunc _do;
};

And then provide a global array for the hierarchy anchored in Foo:

extern FooVTable const* const FooVTableFoo;
extern FooVTable const* const FooVTableBar;

FooVTable const* const FooVTables[] = { FooVTableFoo, FooVTableBar };

enum class FooVTableIndex: unsigned short {
    Foo,
    Bar
};

Then all you need in your Foo class is to hold onto the index of the most derived type:

class Foo {
public:

    void dofunc(int i, int j) {
        (this->*(table()->_do))(i, j);
    }

protected:
    // A scoped enum does not convert implicitly to an array index.
    FooVTable const* table() const { return FooVTables[static_cast<unsigned short>(_vindex)]; }

private:
    FooVTableIndex _vindex;
};

The hierarchy is closed because the FooVTables array and the FooVTableIndex enumeration need to be aware of all the types of the hierarchy.

The enum index can be bypassed though: by making the array non-constant, it is possible to pre-initialize it to a larger size and then, at init, have each derived type register itself there automatically. Conflicts of indexes are thus detected during this init phase, and it is even possible to have automatic resolution (scanning the array for a free slot).

This may be less convenient, but it does provide a way to open the hierarchy. Obviously it is easier to do this before any thread is launched, as we are talking about global variables here.

3) Hand-made polymorphism

(only really works for closed hierarchies)

This last approach is based on my experience exploring the LLVM/Clang codebase. A compiler faces the very same problem as you: with tens or hundreds of thousands of small items, a v-pointer per item really increases memory consumption, which is annoying.

Therefore, they took a simple approach:

each class hierarchy has a companion enum listing all members
each class in the hierarchy passes its companion enumerator to its base upon construction
virtuality is achieved by switching over the enum and casting appropriately

In code:

enum class FooType { Foo, Bar, Bor };

class Foo {
public:
    int dodispatcher() {
        switch(_type) {
        case FooType::Foo:
            return static_cast<Foo&>(*this).dosomething();

        case FooType::Bar:
            return static_cast<Bar&>(*this).dosomething();

        case FooType::Bor:
            return static_cast<Bor&>(*this).dosomething();
        }
        assert(0 && "Should never get there");
    }
private:
    FooType _type;
};

The switches are pretty annoying to write, but they can be more or less automated by playing with some macros and a type list. LLVM typically uses a file like:

 // FooList.inc
 ACT_ON(Foo)
 ACT_ON(Bar)
 ACT_ON(Bor)

and then you do:

 int Foo::dodispatcher() {
     switch(_type) {
 #   define ACT_ON(X) case FooType::X: return static_cast<X&>(*this).dosomething();

 #   include "FooList.inc"

 #   undef ACT_ON
     }

     assert(0 && "Should never get there");
 }

Chris Lattner commented that, because of how switches are generated (using a table of code offsets), this produces code similar to that of a virtual dispatch, and thus has about the same CPU overhead, but for a lower memory overhead.

Obviously, the one drawback is that Foo.cpp needs to include all of the headers of its derived classes, which effectively seals the hierarchy.

I deliberately presented the solutions from the most open one to the most closed one. They have various degrees of complexity and flexibility, and it is up to you to choose which one suits you best.

One important thing: in the latter two cases, destruction and copies require special care.

Answer 1: score 17, accepted, 2012-05-12 10:15:34
