
Switching to a different custom allocator -> propagate to member fields

I profiled my program, and found that changing from standard allocator to a custom one-frame allocator can remove my biggest bottleneck.

Here is a dummy snippet ( coliru link ):-

class Allocator{ //can be stack/heap/one-frame allocator
    //some complex field and algorithm
    //e.g. virtual void* allocate(int amountByte,int align)=0;
    //e.g. virtual void deallocate(void* v)=0;
};
template<class T> class MyArray{
    //some complex field
    Allocator* allo=nullptr;
    public: MyArray( Allocator* a){
        setAllocator(a);
    }
    public: void setAllocator( Allocator* a){
        allo=a;
    }
    public: void add(const T& t){
        //store "t" in some array
    }
    //... other functions
};

However, my one-frame allocator has a drawback: the user must make sure that every object allocated by the one-frame allocator is deleted/released by the end of the time-step.
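For readers unfamiliar with the idea, here is a minimal sketch of what a one-frame (bump) allocator might look like. The class name `OneFrameAllocator` and its `reset()`/`used()` members are assumptions for illustration, not my real class. It shows why everything must be gone by the end of the time-step: `reset()` simply rewinds the offset, so anything still pointing into the buffer dangles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical one-frame (bump) allocator, for illustration only.
class OneFrameAllocator {
    std::vector<unsigned char> buffer_;  // fixed budget for one time-step
    std::size_t offset_ = 0;             // next free byte in buffer_
public:
    explicit OneFrameAllocator(std::size_t capacity) : buffer_(capacity) {}

    // "align" must be a power of two. Returns nullptr when the budget is exhausted.
    void* allocate(std::size_t amountByte, std::size_t align) {
        const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t current = base + offset_;
        const std::uintptr_t aligned =
            (current + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t newOffset =
            static_cast<std::size_t>(aligned - base) + amountByte;
        if (newOffset > buffer_.size()) return nullptr;
        offset_ = newOffset;
        return reinterpret_cast<void*>(aligned);
    }

    // Individual deallocation is a no-op; memory comes back only via reset().
    void deallocate(void*) {}

    // Called once at the end of the time-step. Every pointer handed out
    // during the step is invalid afterwards.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }
};
```

Allocation is just pointer arithmetic, which is where the speed-up comes from. The price is the lifetime rule described above.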

Problem

Here is an example of use-case.

I use the one-frame allocator to store the temporary result M3 (the overlapping surface from collision detection; wiki link ) in a physics engine.

Here is a snippet.
M1 , M2 and M3 are all manifolds, but in different level of detail :-

Allocator oneFrameAllocator;
Allocator heapAllocator;
class M1{};   //e.g. a single-point collision site
class M2{     //e.g. analysed many-point collision site
    public: MyArray<M1> m1s{&oneFrameAllocator};
};
class M3{     //e.g. analysed collision surface
    public: MyArray<M2> m2s{&oneFrameAllocator};
};

Notice that I set the default allocator to oneFrameAllocator (because it saves CPU time).
This works because I create instances of M1 , M2 and M3 only as temporary variables.

Now I want to cache a new instance of M3 ( output_m3=m3; ) for the next timeStep .
(^ to check whether a collision has just started or just ended)

In other words, I want to copy one-frame allocated m3 to heap allocated output_m3 at #3 (shown below).

Here is the game-loop :-

int main(){
    M3 output_m3; //must use "heapAllocator" 
    for(int timeStep=0;timeStep<100;timeStep++){
        //v start complex computation #2
        M3 m3;
        M2 m2;
        M1 m1;
        m2.m1s.add(m1);
        m3.m2s.add(m2);
        //^ end complex computation
        //output_m3=m3; (change allocator, how?  #3)
        //.... clean up oneFrameAllocator here ....
    }
}


I can't assign output_m3=m3 directly, because output_m3 will copy usage of one-frame allocator from m3 .

My poor solution is to build output_m3 from the bottom up.
The code below works, but it is very tedious.

M3 reconstructM3(M3& src,Allocator* allo){
    //very ugly here #1
    M3 m3New;
    m3New.m2s.setAllocator(allo);
    for(int n=0;n<src.m2s.size();n++){
        M2 m2New;
        m2New.m1s.setAllocator(allo);
        for(int k=0;k<src.m2s[n].m1s.size();k++){
            m2New.m1s.add(src.m2s[n].m1s[k]);
        }
        m3New.m2s.add(m2New);
    }
    return m3New;
}
output_m3=reconstructM3(m3,&heapAllocator);

Question

How can I switch the allocator of an object elegantly (without propagating everything by hand)?

Bounty Description

  1. The answer doesn't need to be based on any of my snippets or on anything physics-related. My code may be beyond repair.
  2. IMHO, passing the allocator type as a class template parameter (e.g. MyArray<T,StackAllocator> ) is undesirable.
  3. I don't mind the vtable cost of Allocator::allocate() and Allocator::deallocate() .
  4. I dream of a C++ pattern/tool that can propagate the allocator to the members of a class automatically. Perhaps it is operator=() , as MSalters advised, but I can't find a proper way to achieve it.

Reference : After receiving an answer from JaMiT , I found that this question is similar to Using custom allocator for AllocatorAwareContainer data members of a class .

Justification

At its core, this question is asking for a way to use a custom allocator with a multi-level container. There are other stipulations, but after thinking about this, I've decided to ignore some of those stipulations. They seem to be getting in the way of solutions without a good reason. That leaves open the possibility of an answer from the standard library: std::scoped_allocator_adaptor and std::vector .
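Before adapting your classes, here is a small sketch of what std::scoped_allocator_adaptor does on its own. `TaggedAlloc` is a hypothetical allocator invented for this illustration; it only carries an id so we can see which allocator each container ended up with. The point is that the inner vector is never handed an allocator explicitly, yet it receives the outer one.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <scoped_allocator>
#include <vector>

// Hypothetical allocator that forwards to std::allocator and carries an id.
template <class T>
struct TaggedAlloc {
    using value_type = T;
    int id;
    explicit TaggedAlloc(int i) : id(i) {}
    template <class U>
    TaggedAlloc(const TaggedAlloc<U>& other) : id(other.id) {}
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <class T, class U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return a.id == b.id; }
template <class T, class U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return a.id != b.id; }

using Inner      = std::vector<int, TaggedAlloc<int>>;
using OuterAlloc = std::scoped_allocator_adaptor<TaggedAlloc<Inner>>;
using Outer      = std::vector<Inner, OuterAlloc>;

// Builds an outer vector with the given id, adds one inner vector without
// naming any allocator, and reports the id the inner vector received.
int innerIdAfterEmplace(int outerId) {
    OuterAlloc alloc{TaggedAlloc<Inner>(outerId)};
    Outer outer(alloc);
    outer.emplace_back();  // no allocator passed here
    return outer.front().get_allocator().id;
}
```

So `innerIdAfterEmplace(7)` yields 7: the adaptor appended its allocator to the (empty) argument list when constructing the element. The rest of this answer is about making your own classes participate in the same mechanism.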

Perhaps the biggest change with this approach is tossing the idea that a container's allocator needs to be modifiable after construction (toss the setAllocator member). That idea seems questionable in general and incorrect in this specific case. Look at the criteria for deciding which allocator to use:

  • One-frame allocation requires the object be destroyed by the end of the loop over timeStep .
  • Heap allocation should be used when one-frame allocation cannot.

That is, you can tell which allocation strategy to use by looking at the scope of the object/variable in question. (Is it inside or outside the loop body?) Scope is known at construction time and does not change (as long as you don't abuse std::move ). So the desired allocator is known at construction time and does not change. However, the current constructors do not permit specifying an allocator. That is something to change. Fortunately, such a change is a fairly natural extension of introducing scoped_allocator_adaptor .

The other big change is tossing the MyArray class. Standard containers exist to make your programming easier. Compared to writing your own version, the standard containers are faster to implement (as in, already done) and less prone to error (the standard strives for a higher bar of quality than "works for me this time"). So out with the MyArray template and in with std::vector .

How to do it

The code snippets in this section can be joined into a single source file that compiles. Just skip over my commentary between them. (This is why only the first snippet includes headers.)

Your current Allocator class is a reasonable starting point. It just needs a pair of methods that indicate when two instances are interchangeable (ie when both are able to deallocate memory that was allocated by either of them). I also took the liberty of changing amountByte to an unsigned type, since allocating a negative amount of memory does not make sense. (I left the type of align alone though, since there is no indication of what values this would take. Possibly it should be unsigned or an enumeration.)

#include <cstdlib>
#include <functional>
#include <scoped_allocator>
#include <type_traits>  // std::true_type
#include <vector>

class Allocator {
public:
    virtual void * allocate(std::size_t amountByte, int align)=0;
    virtual void deallocate(void * v)=0;
    //some complex field and algorithm

    // **** Addition ****
    // Two objects are considered equal when they are interchangeable at deallocation time.
    // There might be a more refined way to define this relation, but without the internals
    // of Allocator, I'll go with simply being the same object.
    bool operator== (const Allocator & other) const  { return this == &other; }
    bool operator!= (const Allocator & other) const  { return this != &other; }
};

Next up are the two specializations. Their details are outside the scope of the question, though. So I'll just mock up something that will compile (needed since one cannot directly instantiate an abstract base class).

// Mock-up to allow defining the two allocators.
class DerivedAllocator : public Allocator {
public:
    void * allocate(std::size_t amountByte, int)  override { return std::malloc(amountByte); }
    void   deallocate(void * v)                   override { std::free(v); }
};
DerivedAllocator oneFrameAllocator;
DerivedAllocator heapAllocator;

Now we get into the first meaty chunk: adapting Allocator to the standard's expectations. This consists of a wrapper template whose parameter is the type of object being constructed. If you can parse the Allocator requirements , this step is simple. Admittedly, parsing the requirements is not simple since they are designed to cover "fancy pointers".

// Standard interface for the allocator
template <class T>
struct AllocatorOf {

    // Some basic definitions:

    //Allocator & alloc; // A plain reference is an option if you don't support swapping.
    std::reference_wrapper<Allocator> alloc; // Or a pointer if you want to add null checks.
    AllocatorOf(Allocator & a) : alloc(a) {} // Note: Implicit conversion allowed

    // Maybe this value would come from a helper template? Tough to say, but as long as
    // the value depends solely on T, the value can be a static class constant.
    static constexpr int ALIGN = 0;

    // The things required by the Allocator requirements:

    using value_type = T;
    // Rebind from other types:
    template <class U>
    AllocatorOf(const AllocatorOf<U> & other) : alloc(other.alloc) {}
    // Pass through to Allocator:
    T *  allocate  (std::size_t n)        { return static_cast<T *>(alloc.get().allocate(n * sizeof(T), ALIGN)); }
    void deallocate(T * ptr, std::size_t) { alloc.get().deallocate(ptr); }
    // Support swapping (helps ease writing a constructor)
    using propagate_on_container_swap = std::true_type;
};
// Also need the interchangeability test at this level.
template<class T, class U>
bool operator== (const AllocatorOf<T> & a_t, const AllocatorOf<U> & a_u)
{ return a_t.alloc.get() == a_u.alloc.get(); } // compares the wrapped Allocator objects
template<class T, class U>
bool operator!= (const AllocatorOf<T> & a_t, const AllocatorOf<U> & a_u)
{ return a_t.alloc.get() != a_u.alloc.get(); }

Next up are the manifold classes. The lowest level (M1) does not need any changes.

The mid-levels (M2) need two additions to get the desired results.

  1. The member type allocator_type needs to be defined. Its existence indicates that the class is allocator-aware.
  2. There needs to be a constructor that takes, as parameters, an object to copy and an allocator to use. This makes the class actually allocator-aware. (Potentially other constructors with an allocator parameter would be required, depending on what you actually do with these classes. The scoped_allocator works by automatically appending the allocator to the provided construction parameters. Since the sample code makes copies inside the vectors, a "copy-plus-allocator" constructor is needed.)
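To make the two items above concrete, here is a small sketch using plain std::allocator so that it stands on its own. The types `Plain` and `Aware` and the `copiedWithAllocator` flag are made up for the demonstration. It checks that the allocator_type alias is what flips std::uses_allocator, and that push_back on a scoped vector then routes through the "copy-plus-allocator" constructor.

```cpp
#include <cassert>
#include <memory>
#include <scoped_allocator>
#include <vector>

struct Plain {};  // no allocator_type, so not allocator-aware

struct Aware {
    using allocator_type = std::allocator<int>;  // item 1: the member type
    std::vector<int, allocator_type> data;
    bool copiedWithAllocator = false;            // hypothetical flag for the demo

    explicit Aware(const allocator_type& a = allocator_type()) : data(a) {}
    Aware(const Aware&) = default;
    // Item 2: the "copy-plus-allocator" constructor.
    Aware(const Aware& other, const allocator_type& a)
        : data(other.data, a), copiedWithAllocator(true) {}
};

static_assert(!std::uses_allocator<Plain, std::allocator<int>>::value,
              "Plain is not allocator-aware");
static_assert(std::uses_allocator<Aware, std::allocator<int>>::value,
              "Aware is allocator-aware");

// Copies an Aware into a scoped vector and reports which constructor ran.
bool pushBackUsesAllocatorConstructor() {
    std::vector<Aware, std::scoped_allocator_adaptor<std::allocator<Aware>>> v;
    Aware a;
    v.push_back(a);  // the adaptor appends its allocator to the arguments
    return v.front().copiedWithAllocator;
}
```

If you delete the two-parameter constructor from `Aware`, the push_back should stop compiling, since the adaptor then finds no way to pass the allocator along.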

In addition, for general use, the mid-levels should get a constructor whose lone parameter is an allocator. For readability, I'll also bring back the MyArray name (but not the template).

The highest level (M3) just needs the constructor taking an allocator. Still, the two type aliases are useful for readability and consistency, so I'll throw them in as well.

class M1{};   //e.g. a single-point collision site

class M2{     //e.g. analysed many-point collision site
public:
    using allocator_type = std::scoped_allocator_adaptor<AllocatorOf<M1>>;
    using MyArray        = std::vector<M1, allocator_type>;

    // Default construction still uses oneFrameAllocator, but this can be overridden.
    explicit M2(const allocator_type & alloc = oneFrameAllocator) : m1s(alloc) {}
    // "Copy" constructor used via scoped_allocator_adaptor
    //M2(const M2 & other, const allocator_type & alloc) : m1s(other.m1s, alloc) {}
    // You may want to instead delegate to the true copy constructor. This means that
    // the m1s array will be copied twice (unless the compiler is able to optimize
    // away the first copy). So this would need to be performance tested.
    M2(const M2 & other, const allocator_type & alloc) : M2(other)
    {
        MyArray realloc{other.m1s, alloc};
        m1s.swap(realloc); // This is where we need swap support.
    }

    MyArray m1s;
};

class M3{     //e.g. analysed collision surface
public:
    using allocator_type = std::scoped_allocator_adaptor<AllocatorOf<M2>>;
    using MyArray        = std::vector<M2, allocator_type>;

    // Default construction still uses oneFrameAllocator, but this can be overridden.
    explicit M3(const allocator_type & alloc = oneFrameAllocator) : m2s(alloc) {}

    MyArray m2s;
};

Let's see... two lines added to Allocator (could be reduced to just one), four-ish to M2 , three to M3 , eliminate the MyArray template, and add the AllocatorOf template. That's not a huge difference. Well, a little more than that count if you want to leverage the auto-generated copy constructor for M2 (but with the benefit of fully supporting the swapping of vectors). Overall, not that drastic a change.

Here is how the code would be used:

int main()
{
    M3 output_m3{heapAllocator};
    for ( int timeStep = 0; timeStep < 100; timeStep++ ) {
        //v start complex computation #2
        M3 m3;
        M2 m2;
        M1 m1;
        m2.m1s.push_back(m1);  // <-- vector uses push_back() instead of add()
        m3.m2s.push_back(m2);  // <-- vector uses push_back() instead of add()
        //^ end complex computation
        output_m3 = m3; // change to heap allocation
        //.... clean up oneFrameAllocator here ....
    }    
}

The assignment seen here preserves the allocation strategy of output_m3 because AllocatorOf does not say to do otherwise (it does not define propagate_on_container_copy_assignment, which therefore defaults to std::false_type). This seems to be the desired behavior, as opposed to the old way of copying the allocation strategy. Note that if both sides of an assignment already use the same allocation strategy, it doesn't matter whether the strategy is preserved or copied. Hence, existing behavior should be preserved with no need for further changes.
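If you want to convince yourself of the assignment behavior in isolation, the following sketch may help. `TaggedAlloc` is again a hypothetical allocator that only carries an id. Like AllocatorOf, it does not mention copy-assignment propagation, so the destination keeps its own allocator and only the elements are copied.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical allocator that forwards to std::allocator and carries an id.
template <class T>
struct TaggedAlloc {
    using value_type = T;
    int id;
    explicit TaggedAlloc(int i) : id(i) {}
    template <class U>
    TaggedAlloc(const TaggedAlloc<U>& other) : id(other.id) {}
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <class T, class U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return a.id == b.id; }
template <class T, class U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return a.id != b.id; }

// Nothing in TaggedAlloc asks for propagation, so the trait defaults to false.
static_assert(!std::allocator_traits<TaggedAlloc<int>>::
                  propagate_on_container_copy_assignment::value,
              "copy assignment keeps the destination's allocator");

// Assigns a "frame" vector (id 1) to a "heap" vector (id 2) and reports
// which allocator id the destination holds afterwards.
int allocatorIdAfterAssignment() {
    TaggedAlloc<int> frameAlloc(1);
    TaggedAlloc<int> heapAlloc(2);
    std::vector<int, TaggedAlloc<int>> frame(frameAlloc);
    std::vector<int, TaggedAlloc<int>> heap(heapAlloc);
    frame.push_back(10);
    heap = frame;  // elements are copied, the allocator is not
    return heap.get_allocator().id;
}
```

The function returns 2, the destination's original id, which mirrors what happens to output_m3 in the loop above.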

Aside from specifying that one variable uses heap allocation, use of the classes is no messier than it was before. Since it was assumed that at some point there would be a need to specify heap allocation, I don't see why this would be objectionable. Use the standard library – it's there to help.

Since you're aiming at performance, I assume that your classes do not manage the lifetime of the allocator itself and simply hold a raw pointer (or reference) to it. Also, since you're changing storage, copying is inevitable. In this case, all you need is to add a "parametrized copy constructor" to each class, e.g.:

template <typename T> class MyArray {
    private:
        Allocator& _allocator;

    public:
        MyArray(Allocator& allocator) : _allocator(allocator) { }
        MyArray(const MyArray& other, Allocator& allocator) : MyArray(allocator) {
            // copy items from "other", passing new allocator to their parametrized copy constructors
        }
};

class M1 {
    public:
        M1(Allocator& allocator) { }
        M1(const M1& other, Allocator& allocator) { }
};

class M2 {
    public:
        MyArray<M1> m1s;

    public:
        M2(Allocator& allocator) : m1s(allocator) { }
        M2(const M2& other, Allocator& allocator) : m1s(other.m1s, allocator) { }
};

This way you can simply do:

M3 stackM3(stackAllocator);
// do processing
M3 heapM3(stackM3, heapAllocator); // or return M3(stackM3, heapAllocator);

to create a copy that is backed by a different allocator.
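For completeness, here is a compilable sketch of the idea with the array body filled in. Everything in it is an assumption made for the demonstration: the `Allocator` below is a malloc-backed stand-in that counts live blocks, and this `MyArray` requires every element type to offer a `T(const T&, Allocator&)` constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the real Allocator; it counts live blocks.
struct Allocator {
    int liveBlocks = 0;
    void* allocate(std::size_t bytes) { ++liveBlocks; return std::malloc(bytes); }
    void deallocate(void* p) { if (p) { --liveBlocks; std::free(p); } }
};

// Minimal array; elements must provide T(const T&, Allocator&).
template <typename T>
class MyArray {
    Allocator& allocator_;
    T* items_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;

    void grow() {
        const std::size_t newCapacity = capacity_ ? capacity_ * 2 : 4;
        T* fresh = static_cast<T*>(allocator_.allocate(newCapacity * sizeof(T)));
        for (std::size_t i = 0; i < size_; ++i) {
            new (fresh + i) T(items_[i], allocator_);  // re-home each element
            items_[i].~T();
        }
        allocator_.deallocate(items_);
        items_ = fresh;
        capacity_ = newCapacity;
    }

public:
    explicit MyArray(Allocator& allocator) : allocator_(allocator) {}
    // Parametrized copy constructor: same items, different allocator.
    MyArray(const MyArray& other, Allocator& allocator) : MyArray(allocator) {
        for (std::size_t i = 0; i < other.size_; ++i) add(other.items_[i]);
    }
    MyArray(const MyArray&) = delete;  // force callers to name an allocator
    MyArray& operator=(const MyArray&) = delete;
    ~MyArray() {
        for (std::size_t i = 0; i < size_; ++i) items_[i].~T();
        allocator_.deallocate(items_);
    }
    void add(const T& t) {
        if (size_ == capacity_) grow();
        new (items_ + size_) T(t, allocator_);  // element inherits our allocator
        ++size_;
    }
    std::size_t size() const { return size_; }
};

struct M1 {
    explicit M1(Allocator&) {}
    M1(const M1&, Allocator&) {}
};
struct M2 {
    MyArray<M1> m1s;
    explicit M2(Allocator& a) : m1s(a) {}
    M2(const M2& other, Allocator& a) : m1s(other.m1s, a) {}
};
struct M3 {
    MyArray<M2> m2s;
    explicit M3(Allocator& a) : m2s(a) {}
    M3(const M3& other, Allocator& a) : m2s(other.m2s, a) {}
};

// Builds an M3 on the "frame" allocator, copies it onto the "heap" allocator,
// and reports how many blocks the copy took from the heap allocator.
int heapBlocksUsedByCopy() {
    Allocator frame, heap;
    M3 m3(frame);
    M2 m2(frame);
    M1 m1(frame);
    m2.m1s.add(m1);
    m3.m2s.add(m2);
    M3 cached(m3, heap);     // whole tree re-homed in one expression
    return heap.liveBlocks;  // one block for m2s, one for the nested m1s
}
```

Deleting the ordinary copy constructor is a design choice in this sketch: it turns an accidental `output_m3 = m3` into a compile error instead of a silently shared allocator.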

Also, depending on your actual code structure, you can add some template magic to automate things:

template <typename T> class MX {
    public:
        MyArray<T> ms;

    public:
        MX(Allocator& allocator) : ms(allocator) { }
        MX(const MX& other, Allocator& allocator) : ms(other.ms, allocator) { }
};

class M2 : public MX<M1> {
    public:
        using MX<M1>::MX; // inherit constructors
};

class M3 : public MX<M2> {
    public:
        using MX<M2>::MX; // inherit constructors
};

I realize this isn't the answer to your question - but if you only need the object for the next cycle ( and not future cycles past that ), can you just keep two one-frame allocators destroying them on alternate cycles?

Since you are writing the allocator yourself this could be handled directly in the allocator where the clean-up function knows if this is an even or odd cycle.

Your code would then look something like:

int main(){
    M3 output_m3; 
    for(int timeStep=0;timeStep<100;timeStep++){
        oneFrameAllocator.set_to_even(timeStep % 2 == 0);
        //v start complex computation #2
        M3 m3;
        M2 m2;
        M1 m1;
        m2.m1s.add(m1);
        m3.m2s.add(m2);
        //^ end complex computation
        output_m3=m3; 
        oneFrameAllocator.cleanup(timeStep % 2 == 1); //cleanup odd cycle
    }
}
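As a rough sketch of how such an alternating allocator could be structured: this version uses a single hypothetical `beginFrame()` call instead of the set_to_even()/cleanup() pair above, and all names are made up. It keeps two arenas and, at the start of each step, rewinds the one whose contents are two steps old.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical double-buffered frame allocator, for illustration only.
class DoubleFrameAllocator {
    std::vector<unsigned char> arenas_[2];
    std::size_t offsets_[2] = {0, 0};
    int current_ = 0;
public:
    explicit DoubleFrameAllocator(std::size_t bytesPerFrame) {
        arenas_[0].resize(bytesPerFrame);
        arenas_[1].resize(bytesPerFrame);
    }

    // Select the arena for this step and rewind it. Its old contents are
    // from two steps ago; the previous step's arena is left untouched.
    void beginFrame(int timeStep) {
        current_ = timeStep % 2;
        offsets_[current_] = 0;
    }

    // Offsets are rounded up to alignof(std::max_align_t), assuming the
    // arena itself starts at a suitably aligned address.
    void* allocate(std::size_t bytes) {
        const std::size_t a = alignof(std::max_align_t);
        const std::size_t start = (offsets_[current_] + a - 1) / a * a;
        if (start + bytes > arenas_[current_].size()) return nullptr;
        offsets_[current_] = start + bytes;
        return arenas_[current_].data() + start;
    }

    void deallocate(void*) {}  // memory is reclaimed by beginFrame()
};

// Checks that a value cached in step N is still intact during step N+1.
bool previousFrameSurvivesOneStep() {
    DoubleFrameAllocator alloc(256);
    int* cached = nullptr;
    for (int timeStep = 0; timeStep < 4; ++timeStep) {
        alloc.beginFrame(timeStep);
        int* fresh = new (alloc.allocate(sizeof(int))) int(timeStep);
        if (cached && *cached != timeStep - 1) return false;
        cached = fresh;  // kept for exactly one more step
    }
    return true;
}
```

Note the limitation this makes visible: anything cached must be re-copied every step, because data older than one step is overwritten.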
