简体   繁体   中英

Low performance from unordered_map when accessing elements in member template function in derived class

I'm trying to implement a component based architecture on a game engine project. Each GameObject has an unordered_map that holds a pointer to Component base class. At this point, I only have one component derived class, which is the Transform class. I wanted to implement this component based architecture similar to the Unity's convention: I want to get a component of game object by calling the member template function like GetComponent<Transform>() .

Here are the headers:

Component.h

enum type{
    TRANSFORM   // more will be added later
};

class Component // base class
{
public:
    Component() : _owner(NULL) {}
    virtual ~Component(){}
    static type Type;

protected:
    GameObject* _owner;
};

Transform.h

class Transform : public Component
{
public:
    Transform();
    ~Transform();
    static type Type;

    void Rotate(float deg);

    // to be encapsulated later on
    Vector2D _position;
    float _rotation;
    Vector2D _scale;
};

GameObject.h

class GameObject
{
public:
    GameObject();
    ~GameObject();

    void Update();
    //and more member functions

    template<class T>
    T* GetComponent();
private:
    // some more private members
    unordered_map<type, Component*> _componentList; // only 1 component of each type
};

template<class T>
T* GameObject::GetComponent()
{       
    return static_cast<T*>(_componentList[T::Type]);
}

My initial implementation used std::vector for keeping Component* and the application ran at 60 fps (I also have a frame rate controller, which just limits the FPS to 60). When I changed to the unordered_map for accessing those component pointers, the performance went downhill to 15 FPS.

在此输入图像描述

I only draw two quads and I call GetComponent<Transform>() only 6 times per frame at this point, so there is not much going on in the scene.


What I tried?

I tried to use const char* , std::string , type_info and finally enum type as key values for the unordered_map but nothing really helps: all implementations got me 15-16 FPS.

What causes this performance issue? How can I isolate the issue?

I hope I provided enough detail, feel free to ask for more code if necessary

Disclaimer: while the information below should hold regardless, the first basic sanity check given such a drastic difference in performance over such a simple case is to first make sure that build optimizations are on. With that out of the way...

unordered_map is designed ultimately as a fairly large-scale container, preallocating a potentially large number of buckets.

See here: std::unordered_map very high memory usage

And here: How does C++ STL unordered_map resolve collisions?

While computing a hash index is trivial, the amount of memory (and strides between) being accessed for such small unordered_maps could very easily turn into a cache-missing bottleneck with something as accessed frequently as retrieving a component interface out of an entity.

For entity-component systems, typically you don't have that many components associated to an entity -- maybe something like dozens tops, and often only a few. As a result, std::vector is actually a far more suitable structure and primarily in terms of locality of reference (small arrays that can often be accessed repeatedly every time you fetch a component interface out of an entity). While a lesser point, std::vector::operator[] is also a function that is trivially-inlined.

If you want to do even better than std::vector here (but I only recommend this after profiling and determining you need it), provided that you can deduce some common case upper bound, N , for the number of components typically available in an entity, something like this may work even better:

struct ComponentList
{
    Component* components[N];
    Component** ptr;
    int num;
};

Start off by setting ptr to components and access them subsequently through ptr . Inserting a new component increases num . When num >= 4 (rare case), change ptr to point to a dynamically-allocated block with a larger size. When destroying ComponentList , free the dynamically-allocated memory if ptr != components . This wastes a little memory if you're storing less than N elements (though std::vector also typically does this with an initial capacity and the way it grows), but it turns your entity and the component list provided into a completely contiguous structure unless num > N . As a result, you get the best locality of reference and potentially even better results than what you started with (I'm assuming from the significant drop in frame rates that you were fetching components out of entities quite frequently out of loops which is not uncommon in ECS).

Given how frequently component interfaces can be accessed from an entity, and very often in very tight loops, this might be worth the effort.

Nevertheless, your original choice of std::vector was actually a better one given the typical scale of the data (number of components available in an entity). With very tiny data sets, basic linear-time sequential searches often outperform more sophisticated data structures, and you often want to instead focus much more on memory/cache efficiency.

I tried to use const char*, std::string, type_info and finally enum type as key values for the unordered_map but nothing really helps: all implementations got me 15-16 FPS.

Just on this note, for keys you want something that can be compared in constant time like an integral value. One possibility that might be handy in this case is an interned string which just stores an int for a balance between convenience and performance (allowing clients to construct them through a string but compare them through an int during component lookups).

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM