(Re)Using std::algorithms with non-standard containers

I have a "column" container type:

struct MyColumnType { 
  // Data: Each row represents a member of an object.
  vector<double> a;   // All vectors are guaranteed to always have
  vector<string> b;   // the same length.
  vector<int> c;

  void copy(int from_pos, int to_pos); // The column type provides an interface
  void swap(int pos_a, int pos_b);     // for copying, swapping, ...

  void push_back();      // And for resizing the container.
  void pop_back();
  void insert(int pos);
  void remove(int pos);
  // The interface can be extended/modified if required
};

Usage:

// Assuming table is a constructed container that already stores elements,
// the members of the object stored at the 4th position are accessed like this:
table.a[4] = 4.0;
table.b[4] = "4th";
table.c[4] = 4;

Question: How can I create a standard-compliant random access iterator (and probably a required proxy reference type) for this kind of container?

I want to be able to use std::algorithms for random access iterators with my type, e.g. sort (note: for sorting, the comparison would be provided by a user-defined functor, e.g. a lambda).

In particular, dereferencing the iterator should provide an interface similar to

struct {
  double& a;
  string& b;
  int& c;
};

Note 0: C++11/C++14 is allowed.

Note 1: There is an old paper, http://hci.iwr.uni-heidelberg.de/vigra/documents/DataAccessors.ps, where a similar attempt is undertaken. However, I haven't been able to get their approach working with sort. Requirements like DefaultConstructible are hard to satisfy using a proxy type approach (why std::sort requires types to be default constructible instead of swappable is beyond my understanding).

Note 2: I cannot do the following:

struct MyType {
  double a;
  string b;
  int c;
};

std::vector<MyType> v;

and then use std::algorithm.

Motivation: Performance. A cache line is usually 64 bytes, i.e. 8 doubles. In this simple struct, if you iterate over the doubles, you are polluting each cache line with a string and an int. In other cases, you might get only 1 double transferred per cache line. That is, you end up using 1/8th of the available memory bandwidth. If you need to iterate over a couple of GB of doubles, this simple decision improves your application performance by a factor of 6-7x. And no, I cannot give that up.

Bonus: the answer should be as generic as possible. Think of adding/removing fields to the container type as adding/removing members to a struct. You don't want to change a lot of code every time you add a new member.

I think something like this could be Standard-compliant. It uses some C++11 features to simplify the syntax, but could just as well be changed to comply with C++03, AFAIK.

Tested and works with clang++ 3.2.

Prelude:

#include <vector>
#include <string>
#include <utility>  // for std::swap
#include <iterator>

using std::vector;
using std::string;

// didn't want to insert all those types as nested classes of MyColumnType
namespace MyColumnType_iterator
{
    struct all_copy;
    struct all_reference;
    struct all_iterator;
}

// just provided `begin` and `end` member functions
struct MyColumnType {
    // Data: Each row represents a member of an object.
    vector<double> a;   // All vectors are guaranteed to always have
    vector<string> b;   // the same length.
    vector<int> c;

    void copy(int from_pos, int to_pos); // The column type provides an interface
    void swap(int pos_a, int pos_b);     // for copying, swapping, ...

    void push_back();      // And for resizing the container.
    void pop_back();
    void insert(int pos);
    void remove(int pos);
    // The interface can be extended/modified if required

    using iterator = MyColumnType_iterator::all_iterator;
    iterator begin();
    iterator end();
};

The iterator classes: a value_type (all_copy), a reference type (all_reference) and the iterator type (all_iterator). Iterating is done by keeping and updating three iterators (one to each vector). I don't know if that's the most performant option, though.
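One alternative to three parallel iterators is a single index plus a pointer to the container, so that incrementing touches only one integer. The following is a minimal sketch of that idea, not a measured recommendation; `Table`, `row_view` and `index_cursor` are hypothetical names introduced here, and the cursor implements only a few of the operations a full random access iterator needs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for MyColumnType, reduced to its three vectors.
struct Table {
    std::vector<double> a;
    std::vector<std::string> b;
    std::vector<int> c;
};

// What dereferencing yields: one reference per vector.
struct row_view {
    double& a;
    std::string& b;
    int& c;
};

// Hypothetical cursor: one pointer and one index instead of three iterators.
struct index_cursor {
    Table* table;
    std::ptrdiff_t pos;

    row_view operator*() const {
        assert(table != nullptr && pos >= 0 &&
               static_cast<std::size_t>(pos) < table->a.size());
        std::size_t i = static_cast<std::size_t>(pos);
        return row_view{table->a[i], table->b[i], table->c[i]};
    }
    index_cursor& operator++() { ++pos; return *this; }
    index_cursor& operator+=(std::ptrdiff_t n) { pos += n; return *this; }
    std::ptrdiff_t operator-(index_cursor const& other) const {
        return pos - other.pos;
    }
    bool operator==(index_cursor const& other) const { return pos == other.pos; }
    bool operator!=(index_cursor const& other) const { return pos != other.pos; }
};
```

The trade-off is that every dereference indexes all three vectors again, so whether this beats three iterators would have to be measured for the access pattern at hand.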

How it works: std::iterator_traits defines several associated types for an iterator, see [iterator.traits]/1:

    [...] it is required that if Iterator is the type of an iterator, the types

        iterator_traits<Iterator>::difference_type
        iterator_traits<Iterator>::value_type
        iterator_traits<Iterator>::iterator_category

    be defined as the iterator's difference type, value type and iterator category, respectively. In addition, the types

        iterator_traits<Iterator>::reference
        iterator_traits<Iterator>::pointer

    shall be defined as the iterator's reference and pointer types, that is, for an iterator object a, the same type as the type of *a and a->, respectively

Therefore, you can introduce a struct (all_reference) keeping three references as the reference type. This type is the return value of *a, where a is of the iterator type (possibly const-qualified). There needs to be a different value_type because some Standard Library algorithms such as sort might want to create a local variable temporarily storing the value of *a (by copy or move into the local variable). In this case, all_copy provides this functionality.

You're not required to use it (all_copy) in your own loops, where it could affect performance.
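To see why a separate value type is needed, consider the step an insertion sort performs: it saves one element, shifts its neighbours, and writes the saved element back. The sketch below reproduces that step for a hypothetical two-vector analogue (`row_ref`, `row_copy` and `rotate_down` are names made up for this illustration, not part of the code in this answer).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct row_copy;

// Proxy reference: assignment writes through to the vectors.
struct row_ref {
    double& a;
    std::string& b;

    row_ref(double& pa, std::string& pb) : a(pa), b(pb) {}
    row_ref(row_ref const&) = default;

    row_ref& operator=(row_ref const& other) {
        a = other.a;
        b = other.b;
        return *this;
    }
    row_ref& operator=(row_copy&& value);
};

// Value type: owns its data, so it survives while the vectors are overwritten.
struct row_copy {
    double a;
    std::string b;
    explicit row_copy(row_ref const& r) : a(r.a), b(r.b) {}
};

inline row_ref& row_ref::operator=(row_copy&& value) {
    a = value.a;
    b = std::move(value.b);
    return *this;
}

// Moves the element at position `from` down to position `to` (to <= from),
// shifting the elements in between up by one.
inline void rotate_down(std::vector<double>& a, std::vector<std::string>& b,
                        std::size_t from, std::size_t to) {
    assert(a.size() == b.size() && to <= from && from < a.size());
    row_ref last(a[from], b[from]);
    row_copy saved(last);                       // value type: a real copy
    for (std::size_t i = from; i > to; --i) {
        row_ref dst(a[i], b[i]);
        row_ref src(a[i - 1], b[i - 1]);
        dst = src;                              // reference type: writes through
    }
    row_ref target(a[to], b[to]);
    target = std::move(saved);
}
```

If `saved` were a `row_ref` instead of a `row_copy`, the first shift would overwrite the element it refers to and the original value would be lost.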

namespace MyColumnType_iterator
{
    struct all_copy;

    struct all_reference
    {
        double& a;
        string& b;
        int& c;

        all_reference() = delete;
        // not required for std::sort, but stream output is simpler to write
        // with this
        all_reference(all_reference const&) = default;
        all_reference(double& pa, string& pb, int& pc)
            : a(pa)
            , b(pb)
            , c(pc)
        {}

        // MoveConstructible required for std::sort
        all_reference(all_reference&& other) = default;
        // MoveAssignable required for std::sort
        all_reference& operator= (all_reference&& other)
        {
            a = std::move(other.a);
            b = std::move(other.b);
            c = std::move(other.c);

            return *this;
        }

        // swappable required for std::sort
        friend void swap(all_reference p0, all_reference p1)
        {
            std::swap(p0.a, p1.a);
            std::swap(p0.b, p1.b);
            std::swap(p0.c, p1.c);
        }

        // assignment from a stored value; these are not special member
        // functions, so they cannot be defaulted and are defined below,
        // once all_copy is complete
        all_reference& operator= (all_copy const& p);
        all_reference& operator= (all_copy&& p);

        // strict total ordering required for std::sort
        friend bool operator< (all_reference const& lhs,
                               all_reference const& rhs);
        friend bool operator< (all_reference const& lhs, all_copy const& rhs);
        friend bool operator< (all_copy const& lhs, all_reference const& rhs);
    };

    struct all_copy
    {
        double a;
        string b;
        int c;

        all_copy(all_reference const& p)
            : a{p.a}
            , b{p.b}
            , c{p.c}
        {}
        all_copy(all_reference&& p)
            : a{ std::move(p.a) }
            , b{ std::move(p.b) }
            , c{ std::move(p.c) }
        {}
    };

    inline all_reference& all_reference::operator= (all_copy const& p)
    {
        a = p.a;
        b = p.b;
        c = p.c;
        return *this;
    }
    inline all_reference& all_reference::operator= (all_copy&& p)
    {
        a = std::move(p.a);
        b = std::move(p.b);
        c = std::move(p.c);
        return *this;
    }

There needs to be a comparison function for std::sort. For some reason we have to provide all three.

    bool operator< (all_reference const& lhs, all_reference const& rhs)
    {
        return lhs.c < rhs.c;
    }
    bool operator< (all_reference const& lhs, all_copy const& rhs)
    {
        return lhs.c < rhs.c;
    }
    bool operator< (all_copy const& lhs, all_reference const& rhs)
    {
        return lhs.c < rhs.c;
    }
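A likely reason all three overloads are needed is that a sorting algorithm compares dereferenced iterators with each other and also with a temporarily stored value. When the comparison is passed to std::sort as a functor instead, a templated call operator can cover every combination at once. A minimal sketch, assuming stand-in types `ref_like` and `copy_like` and a hypothetical functor `by_c` (none of these names appear in the code above):

```cpp
#include <cassert>
#include <string>

// Stand-ins for all_reference and all_copy, reduced to their data members.
struct ref_like  { double& a; std::string& b; int& c; };
struct copy_like { double a;  std::string b;  int c;  };

// Hypothetical comparison object: orders by member c and accepts any mix of
// reference-like and copy-like arguments.
struct by_c {
    template <typename L, typename R>
    bool operator()(L const& lhs, R const& rhs) const {
        return lhs.c < rhs.c;
    }
};

// Self-check covering the three combinations an algorithm may ask for.
inline void by_c_demo() {
    double d1 = 1.0, d2 = 2.0;
    std::string s1 = "x", s2 = "y";
    int i1 = 3, i2 = 7;
    ref_like r1{d1, s1, i1};
    ref_like r2{d2, s2, i2};
    copy_like value{0.0, "z", 5};
    by_c less;
    assert(less(r1, r2));       // reference vs reference
    assert(less(r1, value));    // reference vs value
    assert(less(value, r2));    // value vs reference
}
```

With C++14, a generic lambda such as `[](auto const& l, auto const& r) { return l.c < r.c; }` expresses the same idea without a named functor.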

Now, the iterator class:

    // note: std::iterator is deprecated since C++17; the member typedefs
    // it provides can also be written out by hand
    struct all_iterator
        : public std::iterator < std::random_access_iterator_tag, all_copy >
    {
        //+ specific to implementation
            using ItA = std::vector<double>::iterator;
            using ItB = std::vector<std::string>::iterator;
            using ItC = std::vector<int>::iterator;
            ItA iA;
            ItB iB;
            ItC iC;

            all_iterator(ItA a, ItB b, ItC c)
                : iA(a)
                , iB(b)
                , iC(c)
            {}
        //- specific to implementation

        //+ for iterator_traits
            using reference = all_reference;
            using pointer = all_reference;
        //- for iterator_traits

        //+ iterator requirement [iterator.iterators]/1
            all_iterator(all_iterator const&) = default;            // CopyConstructible
            all_iterator& operator=(all_iterator const&) = default; // CopyAssignable
            ~all_iterator() = default;                              // Destructible

            void swap(all_iterator& other)                          // lvalues are swappable
            {
                std::swap(iA, other.iA);
                std::swap(iB, other.iB);
                std::swap(iC, other.iC);
            }
        //- iterator requirements [iterator.iterators]/1
        //+ iterator requirement [iterator.iterators]/2
            all_reference operator*()
            {
                return {*iA, *iB, *iC};
            }
            all_iterator& operator++()
            {
                ++iA;
                ++iB;
                ++iC;
                return *this;
            }
        //- iterator requirement [iterator.iterators]/2

        //+ input iterator requirements [input.iterators]/1
            bool operator==(all_iterator const& other) const        // EqualityComparable
            {
                return iA == other.iA;  // should be sufficient (?)
            }
        //- input iterator requirements [input.iterators]/1
        //+ input iterator requirements [input.iterators]/2
            bool operator!=(all_iterator const& other) const        // "UnEqualityComparable"
            {
                return iA != other.iA;  // should be sufficient (?)
            }

            all_reference const operator*() const                   // *a
            {
                return {*iA, *iB, *iC};
            }

            // note: all_reference has no operator-> of its own, so an
            // expression like it->a does not compile; use (*it).a instead
            all_reference operator->()                              // a->m
            {
                return {*iA, *iB, *iC};
            }
            all_reference const operator->() const                  // a->m
            {
                return {*iA, *iB, *iC};
            }

            // ++r already satisfied

            all_iterator operator++(int)                            // r++
            {
                all_iterator temp(*this);
                ++(*this);
                return temp;
            }
        //- input iterator requirements [input.iterators]/2

        //+ output iterator requirements [output.iterators]/1
            // *r = o already satisfied
            // ++r already satisfied
            // r++ already satisfied
            // *r++ = o already satisfied
        //- output iterator requirements [output.iterators]/1

        //+ forward iterator requirements [forward.iterators]/1
            all_iterator() = default;                               // DefaultConstructible
            // r++ already satisfied
            // *r++ already satisfied
            // multi-pass must be guaranteed
        //- forward iterator requirements [forward.iterators]/1

        //+ bidirectional iterator requirements [bidirectional.iterators]/1
            all_iterator& operator--()                              // --r
            {
                --iA;
                --iB;
                --iC;
                return *this;
            }
            all_iterator operator--(int)                            // r--
            {
                all_iterator temp(*this);
                --(*this);
                return temp;
            }
            // *r-- already satisfied
        //- bidirectional iterator requirements [bidirectional.iterators]/1

        //+ random access iterator requirements [random.access.iterators]/1
            all_iterator& operator+=(difference_type p)             // r += n
            {
                iA += p;
                iB += p;
                iC += p;
                return *this;
            }
            all_iterator operator+(difference_type p) const         // a + n
            {
                all_iterator temp(*this);
                temp += p;
                return temp;
            }
            // doesn't have to be a friend function, but this way,
            // we can define it here
            friend all_iterator operator+(difference_type p,
                                         all_iterator temp)         // n + a
            {
                temp += p;
                return temp;
            }

            all_iterator& operator-=(difference_type p)             // r -= n
            {
                iA -= p;
                iB -= p;
                iC -= p;
                return *this;
            }
            all_iterator operator-(difference_type p) const         // a - n
            {
                all_iterator temp(*this);
                temp -= p;
                return temp;
            }

            difference_type operator-(all_iterator const& p) const  // b - a
            {
                return iA - p.iA;   // should be sufficient (?)
            }

            all_reference operator[](difference_type p)             // a[n]
            {
                return *(*this + p);
            }
            all_reference const operator[](difference_type p) const // a[n]
            {
                return *(*this + p);
            }

            bool operator<(all_iterator const& p) const             // a < b
            {
                return iA < p.iA;   // should be sufficient (?)
            }
            bool operator>(all_iterator const& p) const             // a > b
            {
                return iA > p.iA;   // should be sufficient (?)
            }
            bool operator>=(all_iterator const& p) const            // a >= b
            {
                return iA >= p.iA;  // should be sufficient (?)
            }
            bool operator<=(all_iterator const& p) const            // a <= b
            {
                return iA <= p.iA;  // should be sufficient (?)
            }
        //- random access iterator requirements [random.access.iterators]/1
    };
}//- namespace MyColumnType_iterator

MyColumnType::iterator MyColumnType::begin()
{
    return { a.begin(), b.begin(), c.begin() };
}
MyColumnType::iterator MyColumnType::end()
{
    return { a.end(), b.end(), c.end() };
}

Usage example:

#include <iostream>
#include <cstddef>
#include <algorithm>

namespace MyColumnType_iterator
{
    template < typename char_type, typename char_traits >
    std::basic_ostream < char_type, char_traits >&
    operator<< (std::basic_ostream < char_type, char_traits >& o,
                std::iterator_traits<MyColumnType::iterator>::reference p)
    {
        return o << p.a << ";" << p.b << ";" << p.c;
    }
}

int main()
{
    using std::cout;

    MyColumnType mct =
    {
          {1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1}
        , {"j", "i", "h", "g", "f", "e", "d", "c", "b", "a"}
        , {10,    9,   8,   7,   6,   5,   4,   3,   2,   1}
    };

    using ref = std::iterator_traits<MyColumnType::iterator>::reference;
    std::copy(mct.begin(), mct.end(), std::ostream_iterator<ref>(cout, ", "));
    std::cout << std::endl;

    std::sort(mct.begin(), mct.end());
    std::copy(mct.begin(), mct.end(), std::ostream_iterator<ref>(cout, ", "));
    std::cout << std::endl;
}

Output:

1;j;10, 0.9;i;9, 0.8;h;8, 0.7;g;7, 0.6;f;6, 0.5;e;5, 0.4;d;4, 0.3;c;3, 0.2;b;2, 0.1;a;1,
0.1;a;1, 0.2;b;2, 0.3;c;3, 0.4;d;4, 0.5;e;5, 0.6;f;6, 0.7;g;7, 0.8;h;8, 0.9;i;9, 1;j;10,

If you're really concerned about performance and you want to sort your container with std::sort, use the overload that allows you to provide a custom comparison object:

template <class RandomAccessIterator, class Compare>
void sort (RandomAccessIterator first, RandomAccessIterator last, Compare comp);

... and sort an array of indices into the container. Here's how:

You'll need the following members in your container:

struct MyColumnType { 
    // (in addition to the existing members)

    int size() const;

    // swaps columns
    void swap(int l, int r);

    // returns true if column l is less than column r
    bool less(int l, int r) const;
};

Then define the following comparison object:

struct MyColumnTypeLess
{
    const MyColumnType* container;
    MyColumnTypeLess(const MyColumnType* container)
        : container(container)
    {
    }
    bool operator()(int l, int r) const
    {
        return container->less(l, r);
    }
};

And use it to sort an array of indices:

void sortMyColumnType(MyColumnType& container)
{
    std::vector<int> indices;
    // fill with [0, n)
    for(int i = 0; i != container.size(); ++i)
    {
        indices.push_back(i);
    }
    // sort the indices
    std::sort(indices.begin(), indices.end(), MyColumnTypeLess(&container));
}

The 'less' member of the container controls which order to sort in:

bool MyColumnType::less(int l, int r) const
{
    // sort first by a, then b, then c
    return a[l] != a[r] ? a[l] < a[r]
        : b[l] != b[r] ? b[l] < b[r]
        : c[l] < c[r];
}
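The same "first by a, then b, then c" ordering can also be written with std::tie, which builds tuples of references and compares them lexicographically. A minimal sketch, using a hypothetical stand-in struct `Columns` in place of the real container:

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the container, reduced to the three vectors.
struct Columns {
    std::vector<double> a;
    std::vector<std::string> b;
    std::vector<int> c;

    // Lexicographic comparison of positions l and r: first a, then b, then c.
    bool less(int l, int r) const {
        assert(a.size() == b.size() && b.size() == c.size());
        return std::tie(a[l], b[l], c[l]) < std::tie(a[r], b[r], c[r]);
    }
};
```

Adding a field then only means adding one more argument to each std::tie call, which keeps the comparison in line with the "as generic as possible" goal of the question.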

The sorted array of indices can be used in further algorithms, so you can avoid copying the actual data around until you need to.
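When the data does have to be reordered, the sorted indices can be applied to each vector in turn. The helper below is a sketch of one simple way to do that (it builds a reordered copy rather than permuting in place); the name `gather` is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns a reordered copy of `column`: result[i] = column[indices[i]].
// Assumes `indices` is a permutation of [0, column.size()).
template <typename T>
std::vector<T> gather(std::vector<T> const& column,
                      std::vector<int> const& indices) {
    assert(column.size() == indices.size());
    std::vector<T> result;
    result.reserve(column.size());
    for (int index : indices) {
        result.push_back(column[static_cast<std::size_t>(index)]);
    }
    return result;
}
```

For the container above this would be called once per member, e.g. `container.a = gather(container.a, indices);` followed by the same for b and c.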

The std algorithms that depend on an ordering (std::sort, std::stable_sort, std::partial_sort, std::nth_element, std::lower_bound, ...) have overloads that accept a custom comparison object, so they can also be used with this technique.
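As an illustration with a different algorithm, the sketch below uses std::partial_sort on an index array to find the positions of the k smallest keys without moving any data. The function name `smallest_k` and the single `keys` vector are assumptions made to keep the example short; with the container above, the lambda would call `container.less(l, r)` instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the positions of the k smallest keys, in ascending key order.
inline std::vector<int> smallest_k(std::vector<int> const& keys, int k) {
    assert(k >= 0 && static_cast<std::size_t>(k) <= keys.size());
    std::vector<int> indices(keys.size());
    std::iota(indices.begin(), indices.end(), 0);   // fill with [0, n)
    std::partial_sort(indices.begin(), indices.begin() + k, indices.end(),
                      [&keys](int l, int r) { return keys[l] < keys[r]; });
    indices.resize(static_cast<std::size_t>(k));
    return indices;
}
```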

