
Two pointers alternative to C++ std::vector

On typical 64-bit implementations a std::vector object occupies 24 bytes, because it is implemented as 3 pointers (each pointer is 8 bytes on 64-bit CPUs). These pointers are:

  • begin of vector
  • end of vector
  • end of reserved memory for vector (vector capacity)
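To make the layout concrete, here is a minimal sketch of the three-pointer representation as a plain struct. The names three_pointer_layout, size_of, capacity_of and vector_matches_layout are hypothetical and chosen for illustration; the standard does not mandate this layout, it is only what the mainstream implementations happen to use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the usual std::vector layout: three raw pointers.
template <typename T>
struct three_pointer_layout {
    T* first;          // begin of vector
    T* last;           // end of vector
    T* end_of_storage; // end of reserved memory
};

// size is the distance between the first two pointers
template <typename T>
std::size_t size_of(three_pointer_layout<T> const& v) {
    assert(v.first <= v.last);
    return static_cast<std::size_t>(v.last - v.first);
}

// capacity is the distance between the first and the third pointer
template <typename T>
std::size_t capacity_of(three_pointer_layout<T> const& v) {
    assert(v.first <= v.end_of_storage);
    return static_cast<std::size_t>(v.end_of_storage - v.first);
}

// True on the mainstream standard libraries in release builds;
// nothing in the standard guarantees it.
template <typename T>
constexpr bool vector_matches_layout() {
    return sizeof(std::vector<T>) == sizeof(three_pointer_layout<T>);
}
```

Dropping the third pointer saves one machine word per container, which is the saving the question is after.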

Is there a similar container that offers the same interface (except for the capacity feature) but is implemented with only two pointers (or perhaps a pointer plus a size value)?

This makes sense in some cases, such as my current project. I hold millions of vectors in memory, but I do not need the distinction between vector size and capacity, and memory usage is becoming a bottleneck. I have considered a couple of options:

  • std::string: its implementation may be tricky and include things like the short string optimization, which may make things even worse.
  • std::unique_ptr to an allocated buffer: the interface is not as convenient as std::vector, and I would end up writing my own class that wraps the pointer plus the size of the buffer, which is what I am trying to avoid.

So, is there any ready-made alternative? Boost-based solutions are acceptable.

I do not need to change its size after dynamic allocation (from the comments)

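Since you do not need your vectors to grow dynamically, std::valarray<T> may be a good alternative. It has no notion of capacity separate from its size, so an implementation does not need a third pointer; the common standard libraries store just a pointer and a size (or two pointers). Note that its interface differs from std::vector: there is no push_back or at(), and iteration goes through std::begin and std::end.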
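A minimal sketch of std::valarray used as a fixed-size buffer follows. The helpers make_ramp and mean are hypothetical names introduced for illustration; only std::valarray, std::begin/std::end, std::iota and valarray::sum are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <valarray>

// Hypothetical helper: a buffer of n doubles holding 0, 1, ..., n-1.
inline std::valarray<double> make_ramp(std::size_t n) {
    std::valarray<double> v(n);                  // n value-initialized elements
    std::iota(std::begin(v), std::end(v), 0.0);  // valarray has no member begin()/end()
    return v;
}

// Hypothetical helper: arithmetic mean of a non-empty valarray.
inline double mean(std::valarray<double> const& v) {
    assert(v.size() > 0);  // sum() on an empty valarray is undefined behaviour
    return v.sum() / static_cast<double>(v.size());
}
```

On the mainstream 64-bit standard libraries sizeof(std::valarray<double>) is 16 bytes, but that is an implementation detail worth checking with a static_assert in your own build.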

Here's the absolute minimum to encapsulate a unique_ptr<T[]> in a first-class value type that models a RandomAccessRange:

#include <memory>
template <typename T>
struct dyn_array {
    explicit dyn_array(size_t n)
      : _n(n), _data(std::make_unique<T[]>(n)) { }

    auto begin() const { return _data.get(); }
    auto end()   const { return begin() + _n; }
    auto begin()       { return _data.get(); }
    auto end()         { return begin() + _n; }
    auto size()  const { return _n; }
private:
    size_t _n {};
    std::unique_ptr<T[]> _data;
};

That's 15 lines of code. Here is a small test driver (the original answer links a live demo):

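#include <fmt/ranges.h>
#include <numeric>
#include <random>

// prng and size_gen are not defined in the snippet as posted;
// the two definitions below are assumptions that make it compile
static std::mt19937 prng{std::random_device{}()};
static std::uniform_int_distribution<int> size_gen(5, 10);

int main() {
    using std::begin;
    using std::end;

    // keep drawing sizes until the generator produces 10
    for (int n; (n = size_gen(prng)) != 10;) {
        dyn_array<double> data(n);
        std::iota(begin(data), end(data), 0);

        // one size_t plus one unique_ptr: two machine words
        static_assert(sizeof(data) == 2*sizeof(void*));

        fmt::print("Size {} data {}\n", n, data);
    }
}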

Printing, for example:

Size 8 data {0, 1, 2, 3, 4, 5, 6, 7}
Size 6 data {0, 1, 2, 3, 4, 5}
Size 7 data {0, 1, 2, 3, 4, 5, 6}
Size 6 data {0, 1, 2, 3, 4, 5}

I have also posted two extended versions (live demos are linked from the original answer) that

  • add deduction guides and construction from initializers
  • add copy/move semantics
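The linked versions are not reproduced here. As a rough idea of what such an extension might look like, here is a minimal sketch of the two-word array with initializer-list construction and copy/move semantics. The class name dyn_array_v2 is hypothetical, and the sketch assumes T is default-constructible and copy-assignable; it is not the code behind the links.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <memory>
#include <utility>

template <typename T>
class dyn_array_v2 {  // hypothetical name
public:
    explicit dyn_array_v2(std::size_t n = 0)
      : _n(n), _data(n ? std::make_unique<T[]>(n) : nullptr) {}

    // lets dyn_array_v2{3, 2, 1} deduce T = int via the implicit deduction guide
    dyn_array_v2(std::initializer_list<T> init)
      : dyn_array_v2(init.size())
    { std::copy(init.begin(), init.end(), begin()); }

    // deep copy: allocate the same size, then copy the elements
    dyn_array_v2(dyn_array_v2 const& rhs)
      : dyn_array_v2(rhs._n)
    { std::copy(rhs.begin(), rhs.end(), begin()); }

    // move: start empty, then steal the buffer
    dyn_array_v2(dyn_array_v2&& rhs) noexcept { swap(rhs); }

    // copy-and-swap covers both copy and move assignment
    dyn_array_v2& operator=(dyn_array_v2 rhs) noexcept { swap(rhs); return *this; }

    void swap(dyn_array_v2& rhs) noexcept {
        std::swap(_n, rhs._n);
        std::swap(_data, rhs._data);
    }

    T const* begin() const { return _data.get(); }
    T const* end()   const { return begin() + _n; }
    T*       begin()       { return _data.get(); }
    T*       end()         { return begin() + _n; }
    std::size_t size() const { return _n; }

    T& operator[](std::size_t i) { assert(i < _n); return _data[i]; }

private:
    std::size_t _n = 0;
    std::unique_ptr<T[]> _data;
};
```

The object still consists of one size_t and one unique_ptr, so copy support costs nothing in per-object size.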

BONUS: Single Pointer Size

Building on the above, I realized that the size can be stored inside the allocation itself, making the size of dyn_array exactly 1 pointer.


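#include <algorithm>        // std::equal
#include <cstddef>          // size_t
#include <initializer_list>
#include <memory>
#include <new>              // std::bad_alloc
#include <utility>          // std::swap
#include <cstdlib> // committing the sin of malloc for optimization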

template <typename T>
struct dyn_array {
    dyn_array(std::initializer_list<T> init)
    : _imp(allocate(init.size(), true))
    { // initializer_list elements are const, so they can only be copied
      std::uninitialized_copy(init.begin(), init.end(), begin()); }

    dyn_array(dyn_array const& rhs)
    : _imp(allocate(rhs.size(), true))
    { std::uninitialized_copy(rhs.begin(), rhs.end(), begin()); }

    dyn_array(dyn_array&& rhs) noexcept { rhs.swap(*this); }
    dyn_array& operator=(dyn_array rhs) { rhs.swap(*this); return *this; } // copy-and-swap

    explicit dyn_array(size_t n = 0)
    : _imp(allocate(n))
    { }

    auto size()  const { return _imp? _imp->_n : size_t{0}; }
    auto begin() const { return _imp? _imp->_data + 0 : nullptr; }
    auto begin()       { return _imp? _imp->_data + 0 : nullptr; }
    auto end()   const { return begin() + size(); }
    auto end()         { return begin() + size(); }
    auto empty() const { return size() == 0; }

    bool operator==(dyn_array const& rhs) const {
        return size() == rhs.size() &&
            std::equal(rhs.begin(), rhs.end(), begin());
    }

    void swap(dyn_array& rhs) {
        std::swap(_imp, rhs._imp);
    }
private:
    struct Impl {
        size_t _n;
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpedantic"
        T _data[]; // C99 extension
#pragma GCC diagnostic pop
    };
    struct Deleter {
        void operator()(Impl* s) const { 
            while (s->_n) { s->_data[--(s->_n)].~T(); }
            std::free(s);
        }
    };
    using Ptr = std::unique_ptr<Impl, Deleter>;
    Ptr _imp;

    static Ptr allocate(size_t n, bool uninitialized = false) {
        if (!n)
            return {};

        // one block holds the size header followed by the n elements
        auto p = std::malloc(sizeof(Impl) + n*sizeof(T)); // could be more conservative
        if (!p)
            throw std::bad_alloc{};
        auto s = Ptr(static_cast<Impl*>(p));
        s->_n = n;
        if (!uninitialized)
            std::uninitialized_default_construct_n(s->_data, n);
        return s;
    }
};

Which we can use as:

#include <fmt/ranges.h>
#include <cstddef>
#include <string_view>
#include <utility>

static size_t constructions = 0;
static size_t default_ctor = 0, copy_ctor = 0;
static size_t destructions = 0;

struct Sensor final {
    Sensor()              { ++constructions; ++default_ctor; } 
    Sensor(Sensor const&) { ++constructions; ++copy_ctor; } 
    ~Sensor()             { ++destructions;  } 
};

int main() {
    fmt::print("With initializers: {}, {}\n",
        dyn_array{3.1415f},
        dyn_array{"one", "two", "three"});

    fmt::print("Without: {}, {}\n",
        dyn_array<std::string_view>{3}, 
        dyn_array<float>{1},
        dyn_array<int>{}); // empty by default, no allocation

    auto a = dyn_array{3,2,1};
    fmt::print("sizeof(a) == sizeof(void*)? {}\n", sizeof(a) == sizeof(void*));
    auto copy = a;
    fmt::print("copy: {} == {}? {}\n", copy, a, (copy == a));
    auto move = std::move(copy);
    fmt::print("move: {} == {}? {}\n", move, a, (move == a));
    fmt::print("copy now moved-from: {}, empty? {}\n", copy, copy.empty());

    dyn_array<Sensor>(4); // test destructors
    fmt::print("constructions({}) and destructions({}) match? {}\n",
            constructions, destructions, constructions == destructions);
    fmt::print("all default, no copy ctors? {}\n",
            (copy_ctor == 0) && (default_ctor == constructions));

    dyn_array { Sensor{}, Sensor{}, Sensor{} };
    fmt::print("constructions({}) and destructions({}) match? {}\n",
            constructions, destructions, constructions == destructions);
    fmt::print("initializers({}) were uninitialized-copied: {}\n",
            copy_ctor,
            (copy_ctor == 3) && (default_ctor + copy_ctor == constructions));
}

Printing:

With initializers: {3.1415}, {"one", "two", "three"}
Without: {"", "", ""}, {1}
sizeof(a) == sizeof(void*)? true
copy: {3, 2, 1} == {3, 2, 1}? true
move: {3, 2, 1} == {3, 2, 1}? true
copy now moved-from: {}, empty? true
constructions(4) and destructions(4) match? true
all default, no copy ctors? true
constructions(10) and destructions(10) match? true
initializers(3) were uninitialized-copied: true

Of course you can do this without flexible array members (a C99 feature that C++ compilers accept only as an extension), but I didn't want to meddle with alignment manually right now.
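For reference, handling the alignment by hand might look roughly like the sketch below: a size header at the start of the block, with the elements placed at the header size rounded up to alignof(T). All names (block_header, data_offset, block_data, make_block, destroy_block) are hypothetical, and the sketch assumes C++17 and that malloc's fundamental alignment is sufficient for T. It is an illustration under those assumptions, not a drop-in replacement for the Impl struct above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical header stored at the start of every allocation.
struct block_header { std::size_t n; };

// Offset of the first element: sizeof(block_header) rounded up
// to the next multiple of alignof(T).
template <typename T>
constexpr std::size_t data_offset() {
    return (sizeof(block_header) + alignof(T) - 1) / alignof(T) * alignof(T);
}

// Address of the element storage that follows the header.
template <typename T>
T* block_data(block_header* h) {
    assert(h != nullptr);
    return reinterpret_cast<T*>(reinterpret_cast<unsigned char*>(h) + data_offset<T>());
}

// Allocates header + n elements in one block and default-constructs the elements.
template <typename T>
block_header* make_block(std::size_t n) {
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "malloc only guarantees fundamental alignment");
    void* p = std::malloc(data_offset<T>() + n * sizeof(T));
    if (!p) throw std::bad_alloc{};
    auto* h = ::new (p) block_header{n};
    try {
        std::uninitialized_default_construct_n(block_data<T>(h), n);
    } catch (...) {
        std::free(p);  // elements already built were destroyed by the algorithm
        throw;
    }
    return h;
}

// Destroys the elements and releases the block; a null pointer is a no-op.
template <typename T>
void destroy_block(block_header* h) noexcept {
    if (!h) return;
    std::destroy_n(block_data<T>(h), h->n);
    std::free(h);
}
```

A dyn_array built on this would hold a single block_header* and forward size() to h->n and begin() to block_data<T>(h).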

BONUS: [] indexing, front, back, at accessors

These are really simple to add (at needs #include <stdexcept> for std::out_of_range):

auto& operator[](size_t n) const { return *(begin() + n); }
auto& operator[](size_t n)       { return *(begin() + n); }
auto& at(size_t n) const { return n<size()?*(begin() + n): throw std::out_of_range("dyn_array::at"); }
auto& at(size_t n)       { return n<size()?*(begin() + n): throw std::out_of_range("dyn_array::at"); }
auto& front() const { return at(0); }
auto& front()       { return at(0); }
auto& back()  const { return at(size()-1); }
auto& back()        { return at(size()-1); }
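One detail worth knowing: because begin() const returns a plain T*, the const overloads above hand out mutable references. If const-correctness matters, a variant might look like the following sketch. The type checked_view is a hypothetical non-owning stand-in for dyn_array (just a pointer plus a size) so that the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for dyn_array: a non-owning pointer + size pair,
// just enough to exercise the accessors.
template <typename T>
struct checked_view {
    T* _first;
    std::size_t _n;

    std::size_t size() const { return _n; }

    // unchecked access (asserts only in debug builds)
    T const& operator[](std::size_t i) const { assert(i < _n); return _first[i]; }
    T&       operator[](std::size_t i)       { assert(i < _n); return _first[i]; }

    // checked access: throws instead of reading out of bounds
    T const& at(std::size_t i) const {
        if (i >= _n) throw std::out_of_range("checked_view::at");
        return _first[i];
    }
    T& at(std::size_t i) {
        if (i >= _n) throw std::out_of_range("checked_view::at");
        return _first[i];
    }

    T const& front() const { return at(0); }
    T&       front()       { return at(0); }
    // when _n == 0, _n - 1 wraps around to the largest size_t, so at() throws
    T const& back()  const { return at(_n - 1); }
    T&       back()        { return at(_n - 1); }
};
```

Routing front and back through at, as the original does, means that calling them on an empty array throws rather than invoking undefined behaviour.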

Of course, push_back, erase, etc. are out, since the size doesn't change after construction.

