
Difference between std::remove and erase for vector?

I have a doubt that I would like to clarify. I am aware that erase and std::remove behave differently for std::vector: the first physically removes an element from the vector, reducing its size, while the other just moves elements around and leaves the size unchanged.

Is this just for efficiency reasons? By using erase, all elements after the erased one in a std::vector are shifted by 1, causing a large number of copies; std::remove does just a 'logical' delete and leaves the size of the vector unchanged by moving things around. If the objects are heavy, that difference might matter, right?


The reason for using this idiom is exactly that. There is a benefit in performance, but not in the case of a single erasure. Where it does matter is when you need to remove multiple elements from the vector. In that case, std::remove copies each element that is kept only once, to its final location, while the vector::erase approach would move all of the elements from the erased position to the end multiple times. Consider:

std::vector<int> v{ 1, 2, 3, 4, 5 };
// remove all elements < 5

If you went over the vector removing elements one by one, you would remove the 1, causing the remaining elements to be shifted (4 copies). Then you would remove the 2 and shift all remaining elements by one (3 copies)... if you see the pattern, this is an O(N^2) algorithm.
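The one-at-a-time approach described above might look like the following sketch. The function name `erase_one_by_one` and the `threshold` parameter are made up for illustration and do not come from the original answer.

```cpp
#include <vector>

// Hypothetical helper (illustration only): erase every element smaller than
// `threshold`, with one call to erase() per matching element.
// Each erase() shifts all later elements down by one position, so removing
// many elements this way costs O(N^2) element moves in the worst case.
void erase_one_by_one(std::vector<int>& v, int threshold) {
    for (auto it = v.begin(); it != v.end(); ) {
        if (*it < threshold) {
            it = v.erase(it);   // erase() returns the iterator following the erased element
        } else {
            ++it;
        }
    }
}
```

Called with `v{ 1, 2, 3, 4, 5 }` and a threshold of 5, this performs four separate erasures, each of which shifts the tail.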

In the case of std::remove, the algorithm maintains a read head and a write head, and iterates over the container. For the first 4 elements the read head is moved and the element tested, but no element is copied. Only for the fifth element is the object copied, from the last position to the first, and the algorithm completes with a single copy, returning an iterator to the second position. This is an O(N) algorithm. The subsequent std::vector::erase with the range destroys all the remaining elements and reduces the size of the container.
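The read/write-head description can be sketched roughly as below. `remove_less_than` is a hypothetical hand-written stand-in, included only to show the idea (real code would call std::remove_if), and `erase_less_than` is an equally hypothetical wrapper showing the usual erase-remove idiom.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Illustrative re-implementation of the read/write-head idea. The name is
// hypothetical; this is not the library's actual implementation.
std::vector<int>::iterator remove_less_than(std::vector<int>& v, int threshold) {
    auto write = v.begin();
    for (auto read = v.begin(); read != v.end(); ++read) {
        if (!(*read < threshold)) {          // this element is kept
            if (write != read) {
                *write = std::move(*read);   // moved at most once, to its final slot
            }
            ++write;
        }
    }
    return write;                            // new logical end; v.size() is unchanged
}

// The erase-remove idiom with the standard algorithm: one O(N) pass,
// followed by a single range erase that destroys the tail.
void erase_less_than(std::vector<int>& v, int threshold) {
    v.erase(std::remove_if(v.begin(), v.end(),
                           [threshold](int x) { return x < threshold; }),
            v.end());
}
```

For `{ 1, 2, 3, 4, 5 }` and a threshold of 5, the first function performs a single move and returns an iterator to the second position; the size only changes once erase is called.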

As others have mentioned, in the standard library algorithms are applied to iterators and lack knowledge of the sequence being iterated. This design is more flexible than approaches in which algorithms are aware of the containers, in that a single implementation of the algorithm can be used with any sequence that complies with the iterator requirements. Consider, for example, std::remove_copy_if: it can be used even without containers, by using iterators that generate/accept sequences:

std::remove_copy_if(std::istream_iterator<int>(std::cin),
                    std::istream_iterator<int>(),
                    std::ostream_iterator<int>(std::cout, " "),
                    [](int x) { return !(x%2); } // is even
                    );

That single statement will filter out all even numbers from standard input and dump the remaining (odd) numbers to standard output, without requiring all the numbers to be loaded into memory in a container. This is the advantage of the split; the disadvantage is that the algorithms cannot modify the container itself, only the values referred to by the iterators.

std::remove is an algorithm from the STL, which is quite container agnostic. It requires some concept, true, but it has been designed to also work with C arrays, which are fixed in size.

std::remove simply returns a new end() iterator that points one past the last non-removed element (the number of items from the returned value to end() will match the number of items to be removed, but there is no guarantee their values are the same as those you were removing; they are in a valid but unspecified state). This is done so that it can work for multiple container types (basically any container type that a ForwardIterator can iterate through).
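As a small sketch of why returning an iterator is enough, here is std::remove applied to a plain C array, which cannot shrink at all. The helper name `compact_array` is an assumption made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical helper: "removes" every `value` from a fixed-size C array and
// returns how many elements remain in the logical range [arr, arr + result).
// The array keeps its size; elements past the logical end are unspecified.
template <std::size_t N>
std::size_t compact_array(int (&arr)[N], int value) {
    int* new_end = std::remove(std::begin(arr), std::end(arr), value);
    return static_cast<std::size_t>(new_end - std::begin(arr));
}
```

The caller is expected to use only the first `result` elements afterwards, which is the array equivalent of what erase does for a vector.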

std::vector::erase actually sets the new end() iterator after adjusting the size. This is because the vector's method actually knows how to handle adjusting its iterators (the same can be done with std::list::erase, std::deque::erase, etc.).

remove reorders a given range so that the elements to keep come first and the unwanted objects are left at the end. The container's erase function actually handles the "removing" the way that container needs it to be done. That is why they are separate.

I think it has to do with needing direct access to the vector itself to be able to resize it. std::remove only has access to the iterators, so it has no way of telling the vector "Hey, you now have fewer elements".

See yves Baumes's answer as to why std::remove is designed this way.

Yes, that's the gist of it. Note that erase is also supported by the other standard containers, where its performance characteristics are different (e.g. list::erase is O(1)), while std::remove is container-agnostic and works with any type of forward iterator (so it works for e.g. bare arrays as well).

Kind of. Algorithms such as remove work on iterators (which are an abstraction to represent an element in a collection), which do not necessarily know which type of collection they are operating on, and therefore cannot call members on the collection to do the actual removal.

This is good because it allows algorithms to work generically on any container and also on ranges that are subsets of the entire collection.
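A minimal sketch of the "subset of the collection" point: the same algorithm can be pointed at only part of a vector. The helper `erase_value_from` and its `first_index` parameter are hypothetical names chosen for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: drop every `value` found at index `first_index` or
// later, leaving the elements before that index untouched.
// Assumes first_index <= v.size().
void erase_value_from(std::vector<int>& v, std::size_t first_index, int value) {
    auto first = v.begin() + static_cast<std::ptrdiff_t>(first_index);
    v.erase(std::remove(first, v.end(), value), v.end());
}
```

With `{0, 1, 0, 2, 0, 3}`, a start index of 2 and a value of 0, the leading 0 survives because it lies outside the range handed to std::remove.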

Also, as you say, there is performance: it may not be necessary to actually remove (and destroy) the elements if all you need is access to the logical end position to pass on to another algorithm.
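For instance, the logical end can be handed straight to another algorithm without ever calling erase. The following is only a sketch; `sum_without` is an invented name.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: sum everything except `skip`, using the logical end
// returned by std::remove and never calling erase().
// The vector is taken by value because std::remove reorders its input.
int sum_without(std::vector<int> v, int skip) {
    auto logical_end = std::remove(v.begin(), v.end(), skip);
    return std::accumulate(v.begin(), logical_end, 0);
}
```

Nothing is destroyed here; the elements past `logical_end` are simply ignored.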

Standard library algorithms operate on sequences. A sequence is defined by a pair of iterators; the first points at the first element in the sequence, and the second points one past the end of the sequence. That's all; algorithms don't care where the sequence comes from.

Standard library containers hold data values, and provide a pair of iterators that specify a sequence for use by algorithms. They also provide member functions that may be able to do the same operations as an algorithm more efficiently by taking advantage of the internal data structure of the container.
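One example of such a member function is std::list::remove, which unlinks the matching nodes directly instead of shifting values, and shrinks the list in the same call. The wrapper name `drop_value` below is hypothetical.

```cpp
#include <list>

// Hypothetical wrapper around the std::list::remove member function.
// Unlike the std::remove algorithm, the member knows the container, so the
// matching nodes are actually erased and size() goes down.
void drop_value(std::list<int>& l, int value) {
    l.remove(value);
}
```

No separate erase call is needed here, because the member function has access to the list itself.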

Try the following code to get a better understanding.

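#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v = {1, 2, 3, 4, 5, 6, 7, 8};
    // std::remove shifts the kept elements forward and returns the new logical end.
    const auto newend = std::remove(v.begin(), v.end(), 2);

    // Still 8 elements: the size is unchanged and the last one is unspecified.
    for (auto a : v) std::cout << a << " ";
    std::cout << std::endl;

    // erase physically drops the tail, leaving 1 3 4 5 6 7 8.
    v.erase(newend, v.end());
    for (auto a : v) std::cout << a << " ";
    std::cout << std::endl;
}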
