Avoiding copies when using structs in STL sets and custom comparison function

Imagine you have a struct with a bunch of members, and you want to use a particular value referenced via one of its members as the key in a set, like so:

class ComplexClass {
 public:
  const string& name() const;
  // tons of other stuff
};
struct MyStruct {
  ComplexClass* c;
  MoreStuff* x;
};
struct CmpMyStruct {
  bool operator()(const MyStruct& lhs, const MyStruct& rhs) const {
    return lhs.c->name() < rhs.c->name();
  }
};
typedef set<MyStruct, CmpMyStruct> MySet;
MySet my_set;

This works just fine. However, I'd now like to do a lookup by a string name, but my_set.find() of course takes a 'const MyStruct&'. If the name were a member of MyStruct rather than something fetched from ComplexClass, I could quickly fake up an instance of MyStruct and use that:

MyStruct tmp_for_lookup;
tmp_for_lookup.name = "name_to_search";  // Doesn't work of course
MySet::iterator iter =  my_set.find(tmp_for_lookup);

However, as said, that's not how it works: the name is in ComplexClass, so I'd have to at least put a mock of that in there or something.

So what I actually want is for the STL set not to compare MyStructs but rather to first "project" the key (which has type string) out of the MyStruct, and then do its operations, including find(), on that. I started digging into the implementation of set/map in gcc to see how they solved that problem for the map, and was saddened to see that they actually solved it in the internal _Rb_tree but didn't expose it, since it's not part of the standard. From gcc's stl_tree.h:

template<typename _Key, typename _Val, typename _KeyOfValue,
         typename _Compare, typename _Alloc = allocator<_Val> >
  class _Rb_tree
  {
....
template<typename _Key, typename _Val, typename _KeyOfValue,
         typename _Compare, typename _Alloc>
  typename _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::iterator
  _Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::
  find(const _Key& __k)

And then in stl_map.h:

    typedef _Rb_tree<key_type, value_type, _Select1st<value_type>,
             key_compare, _Pair_alloc_type> _Rep_type;

Note how it uses '_Select1st' to project the key out of the value_type, so find() can actually just work with the key. stl_set.h, on the other hand, just uses _Identity in that case, as expected.

So I was wondering: is there a way I'm currently missing to achieve the same beauty and efficiency with the normal STL sets/maps (i.e. I absolutely don't want to use the GCC-specific _Rb_tree directly), such that I could really just do

MySet::iterator iter = my_set.find("somestring");

Note that I specifically don't want to change my_set to be a map from strings to MyStructs, i.e. I do not want to copy the string (or a reference to it) out of the ComplexClass just so I could do map<string, MyStruct> or map<const string&, MyStruct> instead.

This is almost more of a thought exercise at this point, but it seemed interesting :)

"now I'd like to do a lookup by a string name, but my_set.find() now takes of course a 'const MyStruct&'."

This is a well-known deficiency of the interface of std::set.

boost::multi_index with an ordered_unique index provides the interface of std::set with extra find() functions that take a comparable key, rather than a whole set element.

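#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <string>

struct ComplexClass {
    // Returned by reference so that KeyName below does not return a dangling reference.
    std::string const& name() const;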

    struct KeyName {
        typedef std::string result_type;
        std::string const& operator()(ComplexClass const& c) const {
            return c.name();
        }
    };
};

namespace mi = boost::multi_index;

typedef mi::multi_index_container<
      ComplexClass
    , mi::indexed_by<
            mi::ordered_unique<ComplexClass::KeyName>
          >
    > ComplexClassSet;

int main() {
    ComplexClassSet s;
    // fill the set
    // ...
    // now search by name
    ComplexClassSet::iterator found = s.find("abc");
    if(found != s.end()) {
        // found an element whose name() == "abc"
    }
}

See http://www.boost.org/doc/libs/1_52_0/libs/multi_index/doc/tutorial/key_extraction.html for more details.

If you can handle the overhead of a virtual call on comparison, you could use a trick like this:

class ComplexClass {
 public:
  const string& name() const;
  // tons of other stuff
};
struct MyStruct {
  ComplexClass* c;
  MoreStuff* x;
  virtual const string& key() const { return c->name(); } /* change */
};
struct CmpMyStruct {
  bool operator()(const MyStruct& lhs, const MyStruct& rhs) const {
    return lhs.key() < rhs.key(); /* change */
  }
};
typedef set<MyStruct, CmpMyStruct> MySet;
MySet my_set;

Then, for lookup, add the following struct:

struct MyLookupStruct : MyStruct {
  string search_key;
  explicit MyLookupStruct(const string& key) : search_key(key) {}
  virtual const string& key() const { return search_key; }
};
/* .... */
MySet::iterator iter =  my_set.find(MyLookupStruct("name to find"));

This depends on std::set<>::find not making a copy of the argument, which seems a reasonable assumption (but to my knowledge not explicitly guaranteed).
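A self-contained version of the trick above, as a minimal sketch. ComplexClass is reduced to a name holder, the MoreStuff member is dropped, and find_by_name is a hypothetical helper introduced here for illustration. It relies on find() taking its argument by const reference, so the MyLookupStruct temporary is compared through its virtual key() rather than sliced.

```cpp
#include <cassert>
#include <set>
#include <string>

// Simplified stand-in for the question's ComplexClass (assumption: only name() matters).
class ComplexClass {
 public:
  explicit ComplexClass(const std::string& n) : name_(n) {}
  const std::string& name() const { return name_; }
 private:
  std::string name_;
};

struct MyStruct {
  ComplexClass* c;
  explicit MyStruct(ComplexClass* cc = nullptr) : c(cc) {}
  virtual ~MyStruct() {}
  // Elements stored in the set report the name of their ComplexClass.
  virtual const std::string& key() const {
    assert(c != nullptr);
    return c->name();
  }
};

struct CmpMyStruct {
  bool operator()(const MyStruct& lhs, const MyStruct& rhs) const {
    return lhs.key() < rhs.key();
  }
};

typedef std::set<MyStruct, CmpMyStruct> MySet;

// Lookup-only subclass: carries the search string and overrides key().
struct MyLookupStruct : MyStruct {
  std::string search_key;
  explicit MyLookupStruct(const std::string& k) : search_key(k) {}
  virtual const std::string& key() const { return search_key; }
};

// Hypothetical helper: returns the matching element, or a null pointer if absent.
const MyStruct* find_by_name(const MySet& s, const std::string& name) {
  MySet::const_iterator it = s.find(MyLookupStruct(name));
  return it == s.end() ? nullptr : &*it;
}
```

Note that the lookup object still copies the search string once; only the per-element comparison avoids copies.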

AFAIK, there's no way with the C++ standard library's std::set only, but you can use std::find_if with a predicate, at the cost of a linear scan instead of a logarithmic lookup:

struct cmp_for_lookup {
    std::string search_for;
    cmp_for_lookup(const std::string &s) : search_for(s) {}
    bool operator()(const MyStruct &elem) const { return elem.c->name() == search_for; }
};

std::find_if(my_set.begin(), my_set.end(), cmp_for_lookup("name_to_search"));
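For readers on newer compilers: C++14 added heterogeneous lookup to the ordered containers. When the comparator declares a nested is_transparent type, std::set::find also accepts any type the comparator can compare against an element. The sketch below assumes a C++14 compiler and a simplified ComplexClass; find_by_name is a hypothetical helper, not part of the original question.

```cpp
#include <cassert>
#include <set>
#include <string>

// Simplified stand-in for the question's ComplexClass (assumption).
class ComplexClass {
 public:
  explicit ComplexClass(const std::string& n) : name_(n) {}
  const std::string& name() const { return name_; }
 private:
  std::string name_;
};

struct MyStruct {
  ComplexClass* c;
};

struct CmpMyStruct {
  // The presence of this typedef enables the templated find() overloads (C++14).
  typedef void is_transparent;

  bool operator()(const MyStruct& lhs, const MyStruct& rhs) const {
    return lhs.c->name() < rhs.c->name();
  }
  // Mixed overloads: compare an element directly against a bare name.
  bool operator()(const MyStruct& lhs, const std::string& rhs) const {
    return lhs.c->name() < rhs;
  }
  bool operator()(const std::string& lhs, const MyStruct& rhs) const {
    return lhs < rhs.c->name();
  }
};

typedef std::set<MyStruct, CmpMyStruct> MySet;

// Hypothetical helper: logarithmic lookup by name, no MyStruct is constructed.
const MyStruct* find_by_name(const MySet& s, const std::string& name) {
  MySet::const_iterator it = s.find(name);
  // Sanity check: a hit must carry the requested name.
  assert(it == s.end() || it->c->name() == name);
  return it == s.end() ? nullptr : &*it;
}
```

Passing a std::string rather than a string literal keeps the comparator from building a temporary std::string on every comparison.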

Obviously, since boost is out of the question, and std::set is not intended to provide multiple indexes, you're bound to build your own extra index.

// Produces the (name, pointer-to-element) pair that the index stores.
std::pair<std::string, const MyStruct*> name_and_struct(const MyStruct& s){
   return std::make_pair(s.c->name(), &s);
}


...
std::map<std::string, const MyStruct*> indexByName;
std::transform( my_set.begin(), my_set.end(),
                std::inserter(indexByName, indexByName.begin()),
                &name_and_struct );

 ...
 // operator[] inserts a null entry if "theKey" is missing
 const MyStruct* theFoundStruct = indexByName["theKey"];

(Note: this was written off the top of my head, so treat it as a sketch of the idea rather than tested code.)
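A self-contained sketch of the extra-index idea, under a few assumptions: ComplexClass is simplified to a name holder, and NameIndex, build_index and lookup are hypothetical names introduced here. It looks names up with find(), which, unlike operator[], does not insert an entry for a missing name.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Simplified stand-in for the question's ComplexClass (assumption).
class ComplexClass {
 public:
  explicit ComplexClass(const std::string& n) : name_(n) {}
  const std::string& name() const { return name_; }
 private:
  std::string name_;
};

struct MyStruct {
  ComplexClass* c;
};

struct CmpMyStruct {
  bool operator()(const MyStruct& lhs, const MyStruct& rhs) const {
    return lhs.c->name() < rhs.c->name();
  }
};

typedef std::set<MyStruct, CmpMyStruct> MySet;
typedef std::map<std::string, const MyStruct*> NameIndex;

// Builds the secondary index. Pointers to std::set elements stay valid
// until the element itself is erased, so storing them is safe.
NameIndex build_index(const MySet& s) {
  NameIndex index;
  for (MySet::const_iterator it = s.begin(); it != s.end(); ++it) {
    assert(it->c != nullptr);
    index.insert(std::make_pair(it->c->name(), &*it));
  }
  return index;
}

// Returns the indexed element, or a null pointer if the name is unknown.
const MyStruct* lookup(const NameIndex& index, const std::string& name) {
  NameIndex::const_iterator it = index.find(name);
  return it == index.end() ? nullptr : it->second;
}
```

The trade-off is the one the question wanted to avoid: every name is copied into the index, and the index has to be kept in sync with the set by hand.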

Remember that the keys in an associative container (set or map) are const, because they are sorted as they are inserted and cannot be re-sorted later. So any solution that actually references the inserted object will be vulnerable to the possibility that you change the key member after inserting.

So, the most general solution is to use a map and copy the key member.

If the keys won't change, then you can use pointers as keys. Lookup using a temporary will require getting a pointer to the temporary, but that's OK as long as the key is already present, because find and operator[] then won't keep a copy of their argument. (operator[] does insert its argument when the key is not found, so prefer find if the key may be absent.)

#include <map>
#include <string>

/* Comparison functor for maps that don't own keys. */
struct referent_compare {
    template< typename t >
    bool operator () ( t *lhs, t *rhs ) const
        { return * lhs < * rhs; }
};

/* Convenience function to get a const pointer to a temporary. */
template< typename t >
t const *temp_ptr( t const &o ) { return &o; }

struct s {
    std::string name;
    long birthdate;
    double score;
};

/* Maps that don't own keys.
   Generalizing this to a template adapter left as an exercise. */
std::map< std::string const *, s, referent_compare > byname;
std::map< double const *, s, referent_compare > byscore;

int main() {
    s bob = { "bob", 12000000, 96.3 };
    byname.insert( std::make_pair( & bob.name, bob ) );
    byscore.insert( std::make_pair( & bob.score, bob ) );

    byname[ temp_ptr< std::string >( "bob" ) ].score = 33.1;
}

Of course, another solution is to use a set of pointers and define the comparison function to access the member. Pointers-to-members can be used to generalize that, so the key member is merely a template parameter to the comparison functor.
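A minimal sketch of that set-of-pointers variant, not necessarily what the linked code does. The names member_compare, set_by_name, set_by_score and find_by_name are hypothetical, and the struct s mirrors the example above. The key member is selected by a pointer-to-member template argument.

```cpp
#include <cassert>
#include <set>
#include <string>

struct s {
    std::string name;
    long birthdate;
    double score;
};

/* Orders pointers to `owner` by the member chosen at compile time. */
template< typename owner, typename member, member owner::*key >
struct member_compare {
    bool operator () ( owner const *lhs, owner const *rhs ) const
        { return lhs->*key < rhs->*key; }
};

/* Sets that don't own their elements; the pointed-to objects must outlive them. */
typedef std::set< s const *, member_compare< s, std::string, &s::name > > set_by_name;
typedef std::set< s const *, member_compare< s, double, &s::score > > set_by_score;

/* Hypothetical helper: look up by name through a stack-allocated probe object. */
s const *find_by_name( set_by_name const &index, std::string const &name ) {
    s probe = { name, 0, 0.0 };
    set_by_name::const_iterator it = index.find( &probe );
    /* Sanity check: a hit must carry the requested name. */
    assert( it == index.end() || (*it)->name == name );
    return it == index.end() ? nullptr : *it;
}
```

The probe object copies the search string once per lookup; the elements themselves are never copied, and the same struct can be indexed by several members at once.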

http://ideone.com/GB2jfn
