
What are the pros and cons of using smart pointers as “non-owning references”?

When an object needs to refer to another object without "owning it" (i.e., without being responsible for its lifetime), one way is simply to use raw pointers or raw references, as in this example:

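#include <memory>
#include <vector>

class Edge; // forward declaration, needed by Node

class Node
{
    std::vector<Edge*> incidentEdges; // non-owning
};

class Edge
{
    Node* startNode; // non-owning
    Node* endNode;   // non-owning
};

class Graph
{
    std::vector<std::unique_ptr<Node>> nodes; // owning
    std::vector<std::unique_ptr<Edge>> edges; // owning
};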

(Please don't spend time commenting on the existence of more efficient data structures for graphs; that's my field of expertise and not the point of the question.)

Graph is responsible for the lifetime of nodes and edges, and for guaranteeing that the pointers in Node and Edge never dangle. If the programmer fails to uphold that guarantee, there is a risk of undefined behavior.
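As a rough illustration of what "Graph guarantees the pointers are not dangling" can mean in practice, here is a minimal sketch of the raw-pointer design. The member functions addNode, addEdge, removeEdge and the helper detach are hypothetical names that do not appear in the question; the only point is the ordering in removeEdge: unlink first, destroy second.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

struct Edge;
struct Node { std::vector<Edge*> incidentEdges; };                  // non-owning
struct Edge { Node* startNode = nullptr; Node* endNode = nullptr; }; // non-owning

class Graph
{
    std::vector<std::unique_ptr<Node>> nodes; // owning
    std::vector<std::unique_ptr<Edge>> edges; // owning

    // Hypothetical helper: remove every occurrence of e from n's edge list.
    static void detach(Node& n, Edge* e)
    {
        auto& v = n.incidentEdges;
        v.erase(std::remove(v.begin(), v.end(), e), v.end());
    }

public:
    Node* addNode()
    {
        nodes.push_back(std::make_unique<Node>());
        return nodes.back().get();
    }

    Edge* addEdge(Node* a, Node* b)
    {
        edges.push_back(std::make_unique<Edge>());
        Edge* e = edges.back().get();
        e->startNode = a;
        e->endNode = b;
        a->incidentEdges.push_back(e);
        b->incidentEdges.push_back(e);
        return e;
    }

    // Unlink first, destroy second, so no Node is left holding a dangling Edge*.
    void removeEdge(Edge* e)
    {
        detach(*e->startNode, e);
        detach(*e->endNode, e);
        auto it = std::find_if(edges.begin(), edges.end(),
            [e](const std::unique_ptr<Edge>& p) { return p.get() == e; });
        if (it != edges.end())
            edges.erase(it); // the unique_ptr destroys the Edge here
    }
};
```

Nothing in the type system enforces that every mutation follows this discipline, which is exactly the risk described above.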

At the cost of reference-counting overhead, however, smart pointers could strongly enforce that no undefined behavior occurs: the program would crash gracefully instead. This guarantees that the failure happens at the earliest possible time (avoiding further data corruption) and does not go unnoticed. Here is one possible implementation:

(Edit: fixed the implementation; more details in Yakk's answer. Huge thanks!)

template <class T>
using owning_ptr = std::shared_ptr<T>;

template <class T>
class nonowning_ptr
{
    std::weak_ptr<T> p_;

public:
    nonowning_ptr() : p_() {}
    nonowning_ptr(const nonowning_ptr & p) : p_(p.p_) {}
    nonowning_ptr(const owning_ptr<T> & p) : p_(p) {}

    // checked dereferencing
    owning_ptr<T> get() const
    { 
        if (auto sp = p_.lock())
        {
            return sp; // return the shared_ptr itself so the caller keeps the object alive
        }
        else
        {
            logUsefulInfo();
            saveRecoverableUserData();
            nicelyInformUserAboutError();
            abort(); // or throw exception
        }
    }

    T & operator*() const = delete; // cannot be made safe
    owning_ptr<T> operator->() const { return get(); }

    // [...] other methods forwarding weak_ptr functionality 
};

class Node
{
    std::vector<nonowning_ptr<Edge>> incidentEdges;
};

class Edge
{
    nonowning_ptr<Node> startNode;
    nonowning_ptr<Node> endNode;
};

class Graph
{
    std::vector<owning_ptr<Node>> nodes;
    std::vector<owning_ptr<Edge>> edges;
};

My question is: apart from the obvious performance vs. safety trade-off, what are the pros and cons of each approach?

I'm not asking which one is best; there surely is no single best, and it depends on the use case. I'm asking for factual pros and cons of each method that you may be aware of and that I'm not, which would help in making design decisions (perhaps in terms of readability? maintainability? portability? playing nicely with third-party libraries? preventing use-after-free exploits?).

My question is: apart from the obvious performance vs. safety trade-off, what are the pros and cons of each approach?

Setting aside the fact that performance and safety are the only real questions with smart pointers (performance is why we don't just let a GC handle it safely), there's the fact that your nonowning_ptr class is horribly broken.

Your get function returns a naked pointer. Yet nothing anywhere in your code guarantees that a user of get ends up with either a valid pointer or NULL.

The very instant you destroy the shared_ptr returned by weak_ptr::lock, you remove the only thing that is guaranteed to keep that memory valid. This means that if someone comes along and destroys the last shared_ptr to that memory while you hold your T*, you're screwed.
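The effect can be observed safely through the reference count. The sketch below is only an illustration; OwnerCounts, countOwners and survivesOwnerReset are hypothetical names introduced here, and the counts shown assume a single thread.

```cpp
#include <cassert>
#include <memory>

// Hypothetical record of how many owners exist at two points in time.
struct OwnerCounts { long whileLocked; long afterRelease; };

// Counts the owners while a lock()ed shared_ptr is held and after it is gone.
inline OwnerCounts countOwners(const std::shared_ptr<int>& owner)
{
    std::weak_ptr<int> weak = owner;
    OwnerCounts counts{};
    {
        std::shared_ptr<int> locked = weak.lock(); // temporary co-owner
        counts.whileLocked = owner.use_count();
    }   // `locked` dies here; a T* obtained from it is no longer protected
    counts.afterRelease = owner.use_count();
    return counts;
}

// Returns true if the weak_ptr still sees a live object after the last
// shared_ptr has been reset (it does not).
inline bool survivesOwnerReset()
{
    auto owner = std::make_shared<int>(42);
    std::weak_ptr<int> weak = owner;
    owner.reset();          // last owner gone, the int is destroyed
    return !weak.expired();
}
```

With one external owner, countOwners reports two owners while the locked pointer exists and one afterwards; a raw pointer taken from the locked shared_ptr would from then on depend entirely on that external owner.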

Threading in particular breaks your illusions of safety.

So the most important "con" of nonowning_ptr is that it's broken; it's no safer than a T*.

Your design has a problem: the lifetime check is made inside get(), but if another thread or path of execution (say, the evaluation of other arguments in the same function call) resets the shared_ptr underlying your weak_ptr after that check and before you use the returned pointer, you get UB.

To reduce this, T * get() should be std::shared_ptr<T> get(), and operator-> should also return std::shared_ptr<T>. While this seems impractical, it actually works due to the fun way -> is defined in C++ to auto-recurse: a->m is defined as (*a).m if a is a pointer type, and as (a.operator->())->m otherwise. So your -> returns a shared_ptr, which then has -> called on it, which then returns the pointer. This ensures that the pointer you are applying -> to stays valid long enough.

// checked dereferencing
std::shared_ptr<T> get() const
{ 
  if (auto sp = lock())
    return sp;
  fail();
}

[[noreturn]] static void fail() { std::abort(); } // or whatever; static so the const get() can call it
T & operator*() const = delete; // cannot be made safe
std::shared_ptr<T> operator->() const { return get(); } // works, magically

operator std::shared_ptr<T>() const { return lock(); }
std::shared_ptr<T> lock() const { return p_.lock(); }

Now p->foo(); is (in effect) p.get()->foo(). The shared_ptr returned by get() outlives the call to foo(), so everything is safe as houses.
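To see the temporary doing its job, here is a small self-contained sketch. checked_ptr is a hypothetical, stripped-down stand-in for nonowning_ptr, and Widget, releaseOwnerAndRead and demoArrowKeepsAlive are names invented for the illustration. The member function deliberately drops the only long-lived owner while it is running.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

template <class T>
class checked_ptr // hypothetical stand-in for nonowning_ptr
{
    std::weak_ptr<T> p_;

public:
    checked_ptr(const std::shared_ptr<T>& p) : p_(p) {}

    // Returning shared_ptr (not T*) makes `->` drill down through it,
    // so a temporary owner exists for the whole full-expression.
    std::shared_ptr<T> operator->() const
    {
        if (auto sp = p_.lock())
            return sp;
        std::abort(); // placeholder failure policy
    }
};

struct Widget
{
    int value = 7;

    // Drops the caller's owner, then still reads a member of *this.
    int releaseOwnerAndRead(std::shared_ptr<Widget>& owner)
    {
        owner.reset(); // with a raw T* from get(), *this would die here
        return value;  // fine: the temporary from operator-> co-owns *this
    }
};

inline int demoArrowKeepsAlive()
{
    auto owner = std::make_shared<Widget>();
    checked_ptr<Widget> p(owner);
    return p->releaseOwnerAndRead(owner); // Widget is destroyed after this statement
}
```

The Widget is destroyed only at the end of the statement containing the `->` call, when the temporary shared_ptr goes away.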

There is still a hole in the T& operator*() call, where the reference could outlive its owned object, but this at least patches the -> hole.

You could choose to ban the T& operator*() entirely for safety.

A shared_reference<T> could be written to patch that last hole, but operator. isn't available yet.

Similarly, an operator shared_ptr<T>() would be nice, as would a .lock() method, to allow temporary multi-line ownership. Maybe even an explicit operator bool(), but that runs into the "check, then do, but the check might be invalid before the do" problem that shared pointers and file operations have.
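A sketch of the "temporary multi-line ownership" pattern, which sidesteps the check-then-do problem by checking and acquiring in a single step. The function name readOr and its fallback parameter are assumptions made for this example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: reads the value if the object is still alive,
// otherwise returns `fallback`.
inline int readOr(const std::weak_ptr<int>& weak, int fallback)
{
    // Racy alternative (avoid): test weak.expired() first, then dereference
    // later; the object may be destroyed between the test and the use.
    if (std::shared_ptr<int> sp = weak.lock()) // check and acquire atomically
    {
        // sp co-owns the object for this whole block, however many lines it spans.
        int result = *sp;
        return result;
    }
    return fallback;
}
```

The same shape applies to a .lock() member on nonowning_ptr: the caller holds the returned shared_ptr for as long as the multi-line operation lasts and lets it go afterwards.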


 