
Designing the instantiation and destruction of a class using the pimpl idiom

Note: I've rewritten the question to state my intent more clearly and to make it shorter.

I'm designing a part of a library which has a few requirements:

  • None of the implementation details must be visible from public headers.
  • The memory must be managed by the library.
  • The client accesses the information it needs through a handle reference.

In order to achieve this, I use the pimpl idiom.

What I'm creating is a way to instantiate a tree of entries; after instantiating the tree, the user can add additional behavior to each entry. The tree is later used by other parts of the library to perform some actions. The entries in the tree never have to be copied or moved in memory: once allocated, their memory address remains fixed, even if their parent in the tree changes.

Because other parts of the library need to access the implementation, there needs to be some way to reach it, preferably one that is not available to client code.

I described multiple methods in my original question, but here I present the one I've implemented, which I think may be one of the best ways of achieving this.


Current approach

  • Public constructor takes an (owning) pointer to the implementation class. (1)
  • Public destructor. (2)
  • Friendship with the implementation class. (3)
  • Implementation class provides a static method to get access to the implementation class from a reference to the original class. (4)

Entry.h

// Public header
#pragma once
class EntryImpl;
class Entry final
{
private:
    // 3. Friendship with the implementation class
    friend class EntryImpl;
    EntryImpl* const m_Impl;

public:
    // 1. Constructor takes owning pointer to EntryImpl
    Entry(EntryImpl* impl) : m_Impl(impl) { }
    // 2. Public destructor
    ~Entry() { delete m_Impl; }

    // Public APIs here...
};

EntryImpl.h

// Private header
#pragma once
#include "Entry.h" // Get() below needs the complete Entry type
class EntryImpl final
{
public:
    EntryImpl() { }
    ~EntryImpl() { }

    // 4. Provides the library's internals access to the implementation.
    static EntryImpl& Get(Entry& entry) { return *entry.m_Impl; }

    // As an example function
    void DoSomething() { }
    // Other stuff the implementation does here...
};

Tree.h

// Public header
#pragma once
class Entry;
class TreeImpl;
class Tree final
{
private:
    TreeImpl* const m_Impl;

public:
    Tree();
    ~Tree();

    // Public API
    Entry& CreateEntry();

    void DoSomething();
};

Tree.cpp

// Implementation of Tree
#include "Tree.h"
#include "Entry.h"
#include "EntryImpl.h"
#include <vector>
#include <memory>

// Implement the forward-declared class
class TreeImpl
{
public:
    TreeImpl() { }
    ~TreeImpl() { }

    std::vector<std::unique_ptr<Entry>> m_Entries;
};

Tree::Tree() : m_Impl(new TreeImpl()) { }
Tree::~Tree() { delete m_Impl; }

Entry& Tree::CreateEntry()
{
    // 5. Any constructor parameters can be passed to the private EntryImpl
    //    class and is therefore hidden from the client.
    auto entry = std::make_unique<Entry>(new EntryImpl(/* construction params */));
    Entry& entryRef = *entry;
    // Move it into our own collection
    m_Impl->m_Entries.push_back(std::move(entry));
    return entryRef;
}

void Tree::DoSomething()
{
    for (const auto& entryPtr : m_Impl->m_Entries)
    {
        // 6. Can access the implementation from any implementation
        //    code without modifying the Entry or EntryImpl class.
        EntryImpl& entry = EntryImpl::Get(*entryPtr);
        entry.DoSomething();
    }
}

Advantages

  • Construction parameters of Entry are hidden in EntryImpl's constructor. (5)
  • Any source file in the library code can access EntryImpl without altering Entry's or EntryImpl's files. (6)
  • Works with std::unique_ptr<Entry>, without requiring a special deallocator.

Disadvantages

  • Public destructor allows client code to release the memory of Entry, causing a near immediate crash.
  • Friendship? Though most problems associated with friendship aren't prominent here.

My question solely regards software design. Are there any alternative approaches that may be better for my scenario, or any methods I'm overlooking?

This is almost a Code Review question now, so you might want to consider posting it on CodeReview.SE. Also, it might not fit well with StackOverflow's philosophy of specific questions with specific answers, no discussion. I'll try to present an alternative nevertheless.


Analysis and critique of (details of) the OP's approach

Entry(EntryImpl* impl) : m_Impl(impl) { }
// 2. Public destructor
~Entry() { delete m_Impl; }

As the OP has already stated, neither of those functions should be called by the user of the library. Moreover, because EntryImpl is only forward-declared in the public header, the destructor deletes a pointer to an incomplete type, which invokes Undefined Behaviour if EntryImpl has a non-trivial destructor, for example.

In my opinion, there's not much benefit to preventing users from constructing new Entry objects. In one of the OP's previous approaches, Entry's constructors were all private. With the OP's current solution, a library user can write:
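One way to avoid the incomplete-type problem is to only declare the destructor in the public header and to define it in a source file that sees the complete EntryImpl. The following is a minimal single-file sketch of that idea, with comments marking where the header/source split would be. The counter g_destroyed and the function Example are hypothetical and exist only to make the destruction observable.

```cpp
#include <cassert>

// --- Entry.h (public): EntryImpl stays incomplete here ---
class EntryImpl;
class Entry final
{
    EntryImpl* const m_Impl;
public:
    Entry(EntryImpl* impl);
    ~Entry();   // declared only; nothing is deleted while EntryImpl is incomplete
};

// --- EntryImpl.h (private) ---
int g_destroyed = 0;   // hypothetical counter, for illustration only
class EntryImpl final
{
public:
    ~EntryImpl() { ++g_destroyed; }   // non-trivial destructor
};

// --- Entry.cpp (private): EntryImpl is complete, so delete is well-defined ---
Entry::Entry(EntryImpl* impl) : m_Impl(impl) { }
Entry::~Entry() { delete m_Impl; }

// Destroying one Entry runs exactly one EntryImpl destructor.
void Example()
{
    g_destroyed = 0;
    {
        Entry e(new EntryImpl());
    }
    assert(g_destroyed == 1);
}
```

The public header still exposes nothing about EntryImpl; only the definition of the destructor moves into the library.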

Entry e(0);

Which creates an object e that cannot be reasonably used. Note that Entry should be noncopyable, since it owns the object the data member pointer points to.
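To illustrate the remark about copying: a sketch of Entry with its copy operations deleted, assuming C++11 or later. The static_assert lines are only there to show the effect and are not part of the OP's code.

```cpp
#include <type_traits>

class EntryImpl;   // still opaque to the client
class Entry final
{
    EntryImpl* const m_Impl;
public:
    Entry(EntryImpl* impl) : m_Impl(impl) { }
    ~Entry();      // assumed to be defined where EntryImpl is complete

    // Entry owns m_Impl, so a copy would lead to a double delete.
    Entry(const Entry&) = delete;
    Entry& operator=(const Entry&) = delete;
};

static_assert(!std::is_copy_constructible<Entry>::value,
              "Entry must not be copy-constructible");
static_assert(!std::is_copy_assignable<Entry>::value,
              "Entry must not be copy-assignable");
```

Note that this does not stop a user from writing Entry e(0); it only rules out accidental copies of an owning object.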

However, regardless of the definition of class Entry, a library user can always create an object that refers to any Entry object by using a pointer. (This is an argument against the original implementation that returned an Entry& from the tree.)


As far as I understand the OP's intentions, an Entry object uses a pointer to "extend" its own storage to some fixed memory on the heap:

class Entry final
{
private:
    EntryImpl* const m_Impl;

Since it is const, you can't reseat the pointer. There's also a 1-to-1 relationship between Entry objects and EntryImpl objects. The library interface, however, necessarily deals with EntryImpl pointers. Those are what is essentially passed from the library implementation to the library user. The Entry class itself seems to serve only the purpose of establishing the 1-to-1 relationship between Entry and EntryImpl objects.

It is still not entirely clear to me what the relation between Entries and Trees is. It seems as if each Entry must belong to a Tree, which implies that a Tree object should own all the entries created from it. This in turn implies that whatever the library user gets from Tree::AddEntry should be a view on an entry owned by the tree, that is, a pointer. In this light, you should consider the solution below.


An approach using polymorphism

This approach works (only) if you can share a vtable between the library implementation and the library user. If this is not the case, you can implement a similar approach using an opaque pointer instead of an interface with virtual functions. This even allows defining the library's interface as a C API (see Hourglass interfaces for C++ APIs).
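A rough sketch of the opaque-pointer variant, shown as a single file for brevity. All names (lib_entry, lib_entry_create, lib_entry_value, lib_entry_destroy, EntryHandle, Example) are hypothetical and not part of the OP's library. The client only ever sees a pointer to a struct that is never defined for it.

```cpp
#include <cassert>

// --- public C header: an opaque handle type plus free functions ---
extern "C" {
    typedef struct lib_entry lib_entry;   // never defined for the client
    lib_entry* lib_entry_create(int value);
    int        lib_entry_value(const lib_entry* entry);
    void       lib_entry_destroy(lib_entry* entry);
}

// --- library implementation (C++), hidden from the client ---
struct lib_entry { int value; };

extern "C" lib_entry* lib_entry_create(int value) { return new lib_entry{value}; }
extern "C" int lib_entry_value(const lib_entry* entry) { return entry->value; }
extern "C" void lib_entry_destroy(lib_entry* entry) { delete entry; }

// --- optional header-only C++ wrapper, compiled by the client ---
class EntryHandle
{
    lib_entry* m_handle;
public:
    explicit EntryHandle(int value) : m_handle(lib_entry_create(value)) { }
    ~EntryHandle() { lib_entry_destroy(m_handle); }
    EntryHandle(const EntryHandle&) = delete;
    EntryHandle& operator=(const EntryHandle&) = delete;

    int Value() const { return lib_entry_value(m_handle); }
};

void Example()
{
    EntryHandle handle(42);
    assert(handle.Value() == 42);
}
```

A production C API would also have to keep exceptions (for instance from new) from crossing the extern "C" boundary; that is left out here.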

Let's take a look at a classic solution to the requirements:

// interface headers:

class IEntry // replacement for `Entry`
{
public:
    // public API as virtual functions
};

class Tree
{
    // [implementation]
public:
    IEntry* AddEntry();
    void DoSomething();
};


// implementation headers:

class EntryImpl : public IEntry
{
    // implementation
};

// implementation of `Tree::AddEntry` returns an `EntryImpl*`

This solution is useful if an entry handle (IEntry*) does not own the entry it refers to. By casting from IEntry* to EntryImpl*, the library can communicate with more private parts of the entry. There can even be a second interface for the library that separates EntryImpl from the Tree. No friendship between classes is required for this approach, as far as I can see.

Note that a slightly better solution might be to let the class EntryImpl implement a concept rather than an interface, and wrap EntryImpl objects into an adapter that implements the virtual functions. This allows reusing the EntryImpl class for a different interface.
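For concreteness, the outline above could be filled in roughly as follows. SetValue, InternalValue, ToImpl and Example are hypothetical names added for illustration. The static_cast is only valid under the assumption that every IEntry the library ever hands out is in fact an EntryImpl.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// --- interface headers ---
class IEntry
{
public:
    virtual void SetValue(int value) = 0;   // hypothetical public API
protected:
    ~IEntry() = default;                    // clients cannot delete through IEntry*
};

class TreeImpl;
class Tree
{
    std::unique_ptr<TreeImpl> m_Impl;
public:
    Tree();
    ~Tree();
    IEntry* AddEntry();
};

// --- implementation headers ---
class EntryImpl final : public IEntry
{
    int m_value = 0;
public:
    void SetValue(int value) override { m_value = value; }
    int InternalValue() const { return m_value; }   // not visible through IEntry
};

// Library-internal downcast; no friendship needed.
inline EntryImpl& ToImpl(IEntry& entry) { return static_cast<EntryImpl&>(entry); }

class TreeImpl
{
public:
    std::vector<std::unique_ptr<EntryImpl>> m_Entries;   // the tree owns its entries
};

Tree::Tree() : m_Impl(new TreeImpl()) { }
Tree::~Tree() = default;

IEntry* Tree::AddEntry()
{
    m_Impl->m_Entries.push_back(std::unique_ptr<EntryImpl>(new EntryImpl()));
    return m_Impl->m_Entries.back().get();
}

void Example()
{
    Tree tree;
    IEntry* entry = tree.AddEntry();              // client-side view
    entry->SetValue(7);
    assert(ToImpl(*entry).InternalValue() == 7);  // library-side view
}
```

Because the vector stores std::unique_ptr<EntryImpl>, entries keep a fixed address even when the vector reallocates.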
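A sketch of what such an adapter might look like. EntryAdapter, Value and Example are hypothetical names; the "concept" here is simply the informal requirement that the wrapped type provides a const member function Value().

```cpp
#include <cassert>
#include <utility>

class IEntry
{
public:
    virtual ~IEntry() = default;
    virtual int Value() const = 0;   // hypothetical public API
};

// Plain class without virtual functions. It only has to model the
// informal concept the adapter expects: a const member function Value().
class EntryImpl
{
    int m_value;
public:
    explicit EntryImpl(int value) : m_value(value) { }
    int Value() const { return m_value; }
};

// Adapter implementing the interface by forwarding to any type
// that models the concept.
template <class Impl>
class EntryAdapter final : public IEntry
{
    Impl m_impl;
public:
    explicit EntryAdapter(Impl impl) : m_impl(std::move(impl)) { }

    int Value() const override { return m_impl.Value(); }
    Impl& Get() { return m_impl; }   // library-internal access to the wrapped object
};

void Example()
{
    EntryAdapter<EntryImpl> adapter{EntryImpl{5}};
    IEntry& asInterface = adapter;
    assert(asInterface.Value() == 5);
    assert(adapter.Get().Value() == 5);
}
```

EntryImpl itself no longer depends on IEntry, so the same class could be wrapped by a second adapter for a different interface.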

With the above solution, a library user deals with a pointer:

Tree myTree;
auto myEntry = myTree.AddEntry();
myEntry->SomeFunction();

To document that this pointer does not own the object it points to, you could use what has been named "the world's dumbest smart pointer". Essentially, a lightweight wrapper of a raw pointer that, as a type, expresses that it doesn't own the object it points to:

class Tree
{
    // [implementation]
public:
    non_owning_pointer<IEntry> AddEntry();
    void DoSomething();
};
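The type non_owning_pointer is not a standard type; a minimal sketch of what it could look like is given here. The member set is an assumption, and Widget and Example are hypothetical stand-ins used only for the demonstration. A similar type was proposed for standardization under the name observer_ptr.

```cpp
#include <cassert>

// Hypothetical minimal non-owning pointer wrapper.
template <class T>
class non_owning_pointer
{
    T* m_ptr;
public:
    // Implicit on purpose in this sketch, so raw pointers convert freely.
    non_owning_pointer(T* ptr = nullptr) : m_ptr(ptr) { }

    T* get() const { return m_ptr; }
    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
    explicit operator bool() const { return m_ptr != nullptr; }
    // No destructor logic: the pointee is owned elsewhere.
};

struct Widget { int value; };   // stand-in pointee for illustration

void Example()
{
    Widget widget{3};
    non_owning_pointer<Widget> view(&widget);
    assert(view);
    assert(view->value == 3);
    assert((*view).value == 3);

    non_owning_pointer<Widget> empty;
    assert(!empty);
}
```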

If you want to allow the user to destruct entries, you should remove them from their tree. Otherwise, you have to deal with destroyed entries explicitly, e.g. in TreeImpl::DoSomething. At this point, we're starting to rebuild a resource management system for entries, the first step of which typically is destruction. However, the library user might have various requirements on the lifetime of their entries. If you simply return a shared_ptr, that might be unnecessary overhead; if you return a unique_ptr, the library user might have to wrap that unique_ptr in a shared_ptr. Even if those solutions don't impact performance very much, I'd consider them strange from a conceptual point of view.

Hence, I'd argue that for the interface, you should stick to the most general way of managing lifetime, which is (as far as I know) similar to a combination of manual "new" and "delete" calls. We cannot use those language features directly, since they also deal with memory.

Removing an entry from its tree requires knowledge of both the entry and the tree. That is, either you supply both to the destruction function, or you store a tree pointer in each entry. Another way to look at it is: if you already need a TreeImpl* in EntryImpl, you'll get this for free. On the other hand, the library user might already have the Tree* of each entry.

class Tree
{
    // [implementation]
public:
    non_owning_pointer<IEntry> AddEntry();
    void RemoveEntry(non_owning_pointer<IEntry>);

    void DoSomething();
};

(After writing this, this reminds me of iterators; though they also allow getting to the next entry.)
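A possible implementation of the AddEntry/RemoveEntry pair, as a sketch. For brevity it uses raw pointers in place of non_owning_pointer and keeps the storage directly in Tree rather than behind a TreeImpl. The bool return value, Size and Example are hypothetical additions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class IEntry
{
protected:
    ~IEntry() = default;   // entries are destroyed only via Tree::RemoveEntry
};

class EntryImpl final : public IEntry { };

class Tree
{
    std::vector<std::unique_ptr<EntryImpl>> m_Entries;   // the tree owns every entry
public:
    IEntry* AddEntry()
    {
        m_Entries.push_back(std::unique_ptr<EntryImpl>(new EntryImpl()));
        return m_Entries.back().get();
    }

    // Destroys the entry if it belongs to this tree; returns whether it did.
    bool RemoveEntry(IEntry* entry)
    {
        auto it = std::find_if(m_Entries.begin(), m_Entries.end(),
            [entry](const std::unique_ptr<EntryImpl>& owned)
            { return owned.get() == entry; });
        if (it == m_Entries.end())
            return false;
        m_Entries.erase(it);   // the unique_ptr destructor deletes the EntryImpl
        return true;
    }

    std::size_t Size() const { return m_Entries.size(); }
};

void Example()
{
    Tree tree;
    Tree other;
    IEntry* a = tree.AddEntry();
    IEntry* b = tree.AddEntry();
    assert(tree.Size() == 2);

    assert(!other.RemoveEntry(b));   // b is not owned by `other`
    assert(tree.RemoveEntry(a));     // a is destroyed here
    assert(tree.Size() == 1);
}
```

The linear search is the price of not storing a back pointer or index in each entry; which trade-off is better depends on how often entries are removed.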

With this interface, you can easily write a unique_ptr<IEntry, ..> and a shared_ptr<IEntry>. For example:

namespace detail
{
    // Deleter that hands the entry back to the tree that owns it
    class UniqueEntryPtr_deleter {
        non_owning_pointer<Tree> owner;
    public:
        UniqueEntryPtr_deleter(Tree* t) : owner{t} {}
        void operator()(IEntry* p) { owner->RemoveEntry(p); }
    };
}

using unique_entry_ptr = std::unique_ptr<IEntry, detail::UniqueEntryPtr_deleter>;

auto AddEntry(Tree& t) // convenience function
{ return unique_entry_ptr{ t.AddEntry(), detail::UniqueEntryPtr_deleter{&t} }; }

Similarly, you can create an object that holds a unique_ptr to an entry and a shared_ptr to its Tree owner. This prevents lifetime issues of Entry* that refer to dead trees.
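One way to sketch such an object is to store the shared_ptr to the tree inside the deleter of the unique_ptr, so the tree is guaranteed to outlive the RemoveEntry call. SharedTreeEntry, LiveEntries and Example are hypothetical names, and Tree here is a simplified stand-in that merely counts its live entries.

```cpp
#include <cassert>
#include <memory>

// Simplified stand-ins so the sketch is self-contained.
class IEntry { protected: ~IEntry() = default; };
class EntryImpl final : public IEntry { };

class Tree
{
    int m_live = 0;
public:
    IEntry* AddEntry() { ++m_live; return new EntryImpl(); }
    void RemoveEntry(IEntry* e) { --m_live; delete static_cast<EntryImpl*>(e); }
    int LiveEntries() const { return m_live; }
};

// Keeps the owning tree alive for as long as the entry handle exists.
class SharedTreeEntry
{
    struct Deleter
    {
        std::shared_ptr<Tree> tree;   // shared ownership of the tree
        void operator()(IEntry* e) const { tree->RemoveEntry(e); }
    };
    std::unique_ptr<IEntry, Deleter> m_entry;
public:
    explicit SharedTreeEntry(std::shared_ptr<Tree> tree)
        : m_entry(tree->AddEntry(), Deleter{tree}) { }

    IEntry* get() const { return m_entry.get(); }
};

void Example()
{
    auto tree = std::make_shared<Tree>();
    std::weak_ptr<Tree> watch = tree;
    {
        SharedTreeEntry entry(tree);
        tree.reset();                               // drop the caller's reference
        assert(!watch.expired());                   // the handle keeps the tree alive
        assert(watch.lock()->LiveEntries() == 1);
    }
    assert(watch.expired());                        // entry removed, then tree destroyed
}
```

When the handle is destroyed, the unique_ptr first invokes the deleter and only afterwards destroys it, which releases the last reference to the tree.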


Lifting the abstraction in PIMPL approaches

Of course, using polymorphism easily allows getting from an IEntry* to an EntryImpl* inside the library. Can we solve the issue also for the PIMPL approach? Yes, either via friendship (as in the OP), or via a function that extracts (a copy of) the PIMPL:

class EntryImpl;
class Entry
{
    EntryImpl* pimpl;
public:
    EntryImpl const* get_pimpl() const;
    EntryImpl* get_pimpl();
};

This does not look very nice, but it is necessary for the parts of the library that are compiled by the user to extract that pointer (for example, the user's compiler could select a different memory layout for Entry objects). As long as EntryImpl is an opaque pointer, one could argue that Entry's encapsulation is not violated. In fact, EntryImpl could be well-encapsulated.


 