Smart or raw pointers when implementing data structures?

Question

I have been reading and experimenting with the standard library's smart pointers, unique_ptr and shared_ptr and although they are obviously great replacements for a lot of cases where raw pointers might be deemed dangerous I am unsure of their use when implementing data structures.

To experiment, I have written an example of a hash map which uses shared_ptr - which according to the Meyer's Effective Modern C++ are roughly double the size of a unique_ptr . For this reason I would like to use unique_ptr but I am kind of stumped due to what I am performing in the Add function, updating and copying.

Does anyone have any suggestions for this problem? Should data structures remain to be written using raw pointers?

#pragma once
#include "core.h"

const int TABLE_SIZE = 256;

template<typename K>
class HashKey {
public:
    unsigned long operator()(const K& p_key) const {
        return (p_key) % TABLE_SIZE;
    }
};

template<typename K, typename T>
class HashNode {
public:
    K m_key;
    T m_value;
    std::shared_ptr<HashNode> next = nullptr;
};


template<typename K, typename T, typename F = HashKey<K>>
class HashMap {
public:
    std::array< std::shared_ptr< HashNode<K, T> >, 128 > m_table;
    F m_hash_function;
    int m_elem_count{ 0 };

    void Add(K p_key, T p_value);
};

template<typename K, typename T, typename F = HashKey<K>>
void HashMap<K, T, F>::Add(K p_key, T p_value)
{
    unsigned long key = m_hash_function(p_key);

    std::shared_ptr<HashNode<K, T>> new_node = std::make_shared<HashNode<K, T>>();
    new_node->m_key = p_key;
    new_node->m_value = p_value;


    if (m_table[key] == nullptr) {
        /* only item in the bucket */
        m_table[key] = std::move(new_node);
        m_elem_count++;
    }
    else {
        /* check if item exists so it is replaced */
        std::shared_ptr< HashNode<K, T> > current = m_table[key];
        std::shared_ptr< HashNode<K, T> > previous = m_table[key];
        while (current != nullptr && p_key != current->m_key ) {
            previous = current;
            current = current->next;
        }
        if (current == nullptr) {
            previous->next = new_node;
            //current = new_node;
            m_elem_count++;
        }
        else {
            current->m_value = p_value;
        }

    }
}

void TestHashMap() {

    HashMap<int, std::string> hash_map;

    hash_map.Add(1, "one");
    hash_map.Add(2, "does");
    hash_map.Add(3, "not");
    hash_map.Add(50, "simply");
    hash_map.Add(11, "waltz");
    hash_map.Add(11, "into");
    hash_map.Add(191, "mordor");

    std::cout << hash_map.m_elem_count << std::endl;
}

Answer 1

The choice of smart pointer depends on how your data structure "owns" the heap-allocated objects .

If you need to simply observe, and not own an object (independently of whether it is heap-allocated or not) , a raw pointer, a reference or an std::reference_wrapper is the appropriate choice.

If you need unique ownership (at most one owner of the heap-allocated object) , then use std::unique_ptr . It has no additional time/memory overhead.

If you need shared ownership (any number of owners of the heap-allocated object) , then use std::shared_ptr . It results in additional time/memory overhead because an additional pointer (to the reference count metadata) has to be stored, and because accessing it is guaranteed to be thread-safe.

There's no need to use std::unique_ptr (in place of a raw pointer) unless you actually need to own the object.

Assuming you need to own the object, there's no need to use std::shared_ptr (in place of an std::unique_ptr ) unless you actually need shared ownership semantics .

In your case, it looks like you have a maximum of heap nodes in your HashMap . Therefore, I'm assuming that you want the HashMap instance to be the unique owner of the nodes .

What type should you use?

template<typename K, typename T, typename F = HashKey<K>>
class HashMap {
public:
    std::array</* ? */, 128 > m_table;
    // ...
};

You have two options:

If you want to store the objects with an heap indirection, use std::unique_ptr , as the unique owner of these heap-allocated object is and will always be the HashMap instance.
If you want to store the objects directly into the HashMap , with no heap indirection, then do not use any pointer at all. This could lead to very big HashMap instances. Interface for accessing the next nodes becomes cumbersome.

Option 1 (store nodes in the heap):

This is the most common, and probably best option.

template<typename K, typename T, typename F = HashKey<K>>
class HashMap {
public:
    std::array<std::unique_ptr<HashNode<K, T>>, 128 > m_table;
    // ...
};

This will result in lighter (in terms of memory footprint) HashMap instances.

Note: using an std::vector in place of an std::array will reduce the size of HashMap significantly, but will introduce an additional heap indirection. This is the common way of implementing a similar data structure. You generally want the HashMap instance to be as lightweight as possible, so that it can be copied/moved/stored efficiently.

There's no need to use smart pointers to connect the nodes between each other, as the nodes are owned exclusively by HashMap . A raw pointer will be sufficient .

template<typename K, typename T>
class HashNode {
public:
    // ...
    HashNode* next_ptr = nullptr;
    auto& next() 
    { 
        assert(next_ptr != nullptr);
        return *next_ptr;
    }
};

The code above will work fine, assuming that the HashMap is still alive when accessing next .

Option 2 (store nodes in the map instance):

template<typename K, typename T, typename F = HashKey<K>>
class HashMap {
public:
    std::array<HashNode<K, T>, 128 > m_table;
    // ...
};

HashMap instances may be enormous, depending on the size of HashNode<K, T> .

If you choose to store the nodes directly into the HashMap with no heap indirection, you will have to use an index to access the internal array, as moving/copying the HashMap around will change the memory address of the nodes.

template<typename K, typename T>
class HashNode {
public:
    // ...
    int next_index = -1;
    auto& next(HashMap& map) 
    { 
        assert(next_index != -1);
        return map.m_table[next_index]; 
    }
};

Smart or raw pointers when implementing data structures?

Question

1 answers

solution1
9 ACCPTED 2016-01-07 16:48:25

Smart or raw pointers when implementing data structures?

Question

1 answers

solution1 9 ACCPTED 2016-01-07 16:48:25

solution1
9 ACCPTED 2016-01-07 16:48:25