Map, pair-vector or two vectors…?

I read through some posts and wikis but still cannot decide which approach is suitable for my problem.

I created a class called Sample which contains a certain number of compounds (let's say each is an object of another class, Nuclide), each at a certain relative quantity (a double).

Thus, something like (pseudocode):

class Sample {
    map<Nuclide, double>;
}

If I had the nuclides Ba-133, Co-60 and Cs-137 in the sample, I would have to use exactly those names in code to access those nuclides in the map. However, the only thing I need to do is iterate through the map to perform calculations (which nuclides they are is of no interest), so I will use a for loop. I want to iterate without paying any attention to the key names, so I would need to use an iterator for the map, am I right?

An alternative would be a vector<pair<Nuclide, double> >:

class Sample {
    vector<pair<Nuclide, double> >;
}

or simply two independent vectors:

class Sample {
    vector<Nuclide>;
    vector<double>;
}

In the last option, the link between a nuclide and its quantity would be "meta-information", given only by the position in the respective vector.

Due to my lack of experience, I'd kindly ask for suggestions on which approach to choose. I want iteration through all available compounds to be fast and easy, while keeping the logical structure of the corresponding keys and values.

PS: It's possible that the number of compounds in a sample is very low (1 to 5)!

PPS: Could the last option be modified by some const statements to prevent changes and thus keep the correct order?

If iteration needs to be fast, you don't want std::map<...>: its iteration is a tree walk that chases pointers from node to node, which quickly gets slow. std::map<...> is really only reasonable if you have many mutations to the sequence and you need the sequence ordered by key. If you have mutations but don't care about the order, std::unordered_map<...> is generally a better alternative. Both kinds of maps assume you are looking things up by key, though. From your description I don't really see that to be the case.
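
To illustrate the iteration part of the question, here is a minimal sketch of walking a std::map without ever naming a key. The Nuclide struct, its name-based operator< and the helper total_quantity are hypothetical stand-ins, not taken from the question. Note that std::map requires an ordering on the key type, which is one more thing to maintain if you never look anything up by key.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the question's Nuclide class.
struct Nuclide {
    std::string name;
};

// std::map needs a strict weak ordering on its key type (assumed here: by name).
inline bool operator<(const Nuclide& a, const Nuclide& b) {
    return a.name < b.name;
}

// Visits every entry in key order; the keys themselves are never named.
inline double total_quantity(const std::map<Nuclide, double>& quantities) {
    double total = 0.0;
    for (const auto& entry : quantities) {
        total += entry.second;  // entry.first is the Nuclide, unused here
    }
    return total;
}
```

A range-based for loop uses the map's iterators internally, so no explicit iterator type has to be spelled out.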

std::vector<...> is fast to iterate. It isn't ideal for look-ups, though. If you keep it sorted you can use std::lower_bound() to do a std::map<...>-like look-up (i.e., the complexity is also O(log n)), but the effort of keeping it sorted may make that option too expensive. However, it is an ideal container for keeping together a bunch of objects which are iterated over.
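
The sorted-vector look-up described above could be sketched roughly as follows. This assumes a hypothetical Nuclide identified by a name string; Entry, name_less, insert_sorted and find_quantity are illustrative names, not part of the question's code.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Nuclide class.
struct Nuclide {
    std::string name;
};

using Entry = std::pair<Nuclide, double>;

// Compares a stored entry against a name so std::lower_bound can search by name.
inline bool name_less(const Entry& entry, const std::string& name) {
    return entry.first.name < name;
}

// Inserts while keeping the vector sorted by name:
// O(log n) to find the position, but O(n) to shift the elements behind it.
inline void insert_sorted(std::vector<Entry>& entries, Nuclide nuclide, double quantity) {
    auto pos = std::lower_bound(entries.begin(), entries.end(), nuclide.name, name_less);
    entries.insert(pos, Entry{std::move(nuclide), quantity});
}

// Returns a pointer to the stored quantity, or nullptr if the name is absent.
inline const double* find_quantity(const std::vector<Entry>& entries, const std::string& name) {
    auto pos = std::lower_bound(entries.begin(), entries.end(), name, name_less);
    return (pos != entries.end() && pos->first.name == name) ? &pos->second : nullptr;
}
```

With only 1 to 5 elements, as in the question, the cost of shifting elements on insertion is negligible, and a plain linear search would likely do just as well.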

Whether you want one std::vector<std::pair<...>> or rather two std::vector<...>s depends on how the elements are accessed: if both parts of an element are bound to be accessed together, you want a std::vector<std::pair<...>>, as that keeps data which is accessed together adjacent in memory. On the other hand, if you normally access only one of the two components, using two separate std::vector<...>s will make the iteration faster as more elements fit into a cache line, especially if they are reasonably small like doubles.
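
A rough sketch of the two layouts, for a calculation that only needs the quantities, might look like this. PairedSample, SplitSample and total_quantity are hypothetical names, and Nuclide is reduced to a name string for illustration.

```cpp
#include <cassert>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Nuclide class.
struct Nuclide {
    std::string name;
};

// Layout 1: each nuclide is stored right next to its quantity.
struct PairedSample {
    std::vector<std::pair<Nuclide, double>> entries;
};

// Layout 2: two parallel vectors, linked only by index.
struct SplitSample {
    std::vector<Nuclide> nuclides;
    std::vector<double> quantities;
};

// Steps over whole pairs even though only the double is needed.
inline double total_quantity(const PairedSample& sample) {
    double total = 0.0;
    for (const auto& entry : sample.entries) {
        total += entry.second;
    }
    return total;
}

// Reads only the contiguous doubles; the Nuclide objects are never touched.
inline double total_quantity(const SplitSample& sample) {
    // The index link between the vectors is an invariant the class has to guard.
    assert(sample.nuclides.size() == sample.quantities.size());
    return std::accumulate(sample.quantities.begin(), sample.quantities.end(), 0.0);
}
```

In the split layout, keeping both vectors private and modifying them only through member functions that update both at once is one way to preserve the index correspondence.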

In any case, I'd recommend not exposing the internal structure to the outside world and instead providing an interface which lets you change the underlying representation later. That is, to achieve maximum flexibility you don't want to bake the representation into all your code. For example, if you use accessor function objects (property maps in terms of BGL, or projections in terms of Eric Niebler's Range Proposal) to access the elements based on an iterator, rather than accessing the elements directly, you can change the internal layout without having to touch any of the algorithms (you'll need to recompile the code, though):

// version using std::vector<std::pair<Nuclide, double> >
// - it would just use std::vector<std::pair<Nuclide, double> >::iterator as iterator
// - Sample::key is assumed to name std::pair<Nuclide, double>
auto nuclide_projection = [](Sample::key& key) -> Nuclide& {
    return key.first;
};
auto value_projection = [](Sample::key& key) -> double {
    return key.second;
};

// version using two std::vectors:
// - it would use an iterator interface to an integer, yielding a std::size_t for *it
// - a reference member cannot be default-initialized, so each projector is bound
//   to its vector; `nuclides` and `values` are assumed names for the Sample's vectors
struct nuclide_projector {
    std::vector<Nuclide>& nuclides;
    auto operator()(std::size_t index) const -> Nuclide& { return nuclides[index]; }
};
nuclide_projector nuclide_projection{nuclides};
struct value_projector {
    std::vector<double>& values;
    auto operator()(std::size_t index) const -> double& { return values[index]; }
};
value_projector value_projection{values};

With one pair of these projections in place, an algorithm that simply runs over the elements and prints them could look like this:

template <typename Iterator>
void print(std::ostream& out, Iterator begin, Iterator end) {
    for (; begin != end; ++begin) {
         out << "nuclide=" << nuclide_projection(*begin) << ' '
             << "value=" << value_projection(*begin) << '\n';
    }
}

The two representations are entirely different, but the algorithm accessing them is independent of the choice. This way it is also easy to try different representations: only the representation and the glue to the algorithms accessing it need to be changed.
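
To make this concrete, here is a self-contained sketch that runs one print algorithm over both representations. It differs from the snippets above in that the projections are passed as parameters rather than looked up as globals. index_iterator, print_pairs and print_split are hypothetical helpers, and Nuclide is reduced to a name string with an assumed operator<<.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Nuclide class.
struct Nuclide {
    std::string name;
};

inline std::ostream& operator<<(std::ostream& out, const Nuclide& nuclide) {
    return out << nuclide.name;
}

// Minimal iterator over indices: *it yields a std::size_t.
struct index_iterator {
    std::size_t index;
    std::size_t operator*() const { return index; }
    index_iterator& operator++() { ++index; return *this; }
    bool operator!=(const index_iterator& other) const { return index != other.index; }
};

// The algorithm sees the elements only through the two projections.
template <typename Iterator, typename NuclideProjection, typename ValueProjection>
void print(std::ostream& out, Iterator begin, Iterator end,
           NuclideProjection nuclide_projection, ValueProjection value_projection) {
    for (; begin != end; ++begin) {
        out << "nuclide=" << nuclide_projection(*begin) << ' '
            << "value=" << value_projection(*begin) << '\n';
    }
}

// Representation 1: one vector of pairs, iterated directly.
inline std::string print_pairs(const std::vector<std::pair<Nuclide, double>>& entries) {
    std::ostringstream out;
    print(out, entries.begin(), entries.end(),
          [](const std::pair<Nuclide, double>& e) -> const Nuclide& { return e.first; },
          [](const std::pair<Nuclide, double>& e) { return e.second; });
    return out.str();
}

// Representation 2: two parallel vectors, iterated by index.
inline std::string print_split(const std::vector<Nuclide>& nuclides,
                               const std::vector<double>& values) {
    std::ostringstream out;
    print(out, index_iterator{0}, index_iterator{nuclides.size()},
          [&nuclides](std::size_t i) -> const Nuclide& { return nuclides[i]; },
          [&values](std::size_t i) { return values[i]; });
    return out.str();
}
```

Only print_pairs and print_split know about the storage layout; print itself is unchanged when the representation is swapped.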
