C++ std::vector<>::iterator is not a pointer, why?
Just a little introduction, in simple words. In C++, iterators are "things" on which you can write at least the dereference operator *it and the increment operator ++it; for more advanced bidirectional iterators, the decrement --it; and, last but not least, for random access iterators, the index operator it[n] and possibly addition and subtraction.
Such "things" in C++ are objects of types with the corresponding operator overloads, or plain and simple pointers.
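To make the point concrete, here is a minimal sketch of a function that relies only on the operators listed above, so it accepts plain pointers and class-type iterators alike. The names sum_range and sum_all_three are made up for this illustration.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Uses only what every iterator offers: comparison, *it and ++it.
template <typename It>
double sum_range(It first, It last) {
    double total = 0.0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

// Hypothetical helper: feeds three different kinds of iterators to sum_range.
inline double sum_all_three() {
    double raw[] = {1.0, 2.0, 3.0};
    std::vector<double> vec = {1.0, 2.0, 3.0};
    std::list<double> lst = {1.0, 2.0, 3.0};
    double total = sum_range(raw, raw + 3)               // plain pointers
                 + sum_range(vec.begin(), vec.end())     // vector iterators
                 + sum_range(lst.begin(), lst.end());    // list iterators (node based)
    assert(total == 18.0);
    return total;
}
```

The template never needs to know whether It is a pointer or a class; it only needs the operators to exist.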
std::vector<> is a container class that wraps a contiguous array, so a pointer as iterator makes sense. On the net, and in some literature, you can find vector.begin() used as a pointer.
The rationale for using a pointer is less overhead and higher performance, especially if an optimizing compiler detects iteration and does its thing (vector instructions and stuff). Using iterators might be harder for the compiler to optimize.
Knowing this, my question is: why do modern STL implementations, let's say MSVC++ 2013 or libstdc++ in MinGW 4.7, use a special class for vector iterators?
You're completely correct that vector::iterator could be implemented by a simple pointer; in fact the concept of an iterator is based on that of a pointer to an array element. For other containers, such as map, list, or deque, however, a pointer won't work at all. So why is this not done? Here are three reasons why a class implementation is preferable over a raw pointer.
Implementing an iterator as a separate type allows additional functionality (beyond what is required by the standard), for example (added in an edit following Quentin's comment) the possibility of adding assertions when dereferencing an iterator in debug mode.
Overload resolution: if the iterator were a pointer T*, it could be passed as a valid argument to a function taking T*, while this would not be possible with an iterator type. Thus making std::vector<>::iterator a pointer in fact changes the behaviour of existing code. Consider, for example,
template<typename It> void foo(It begin, It end);
void foo(const double* a, const double* b, size_t n = 0);

std::vector<double> vec;
foo(vec.begin(), vec.end());   // which foo is called?
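A small sketch of how the choice can flip, assuming const access so that the pointer type is exactly const double*. The functions return a label instead of void purely so the outcome can be observed; call_with_pointers and call_with_iterators are hypothetical names. Note that with non-const double* arguments the template would still be the better match, because it needs no qualification conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

template <typename It>
std::string foo(It, It) { return "template"; }

std::string foo(const double*, const double*, std::size_t = 0) { return "pointer overload"; }

// What the call would resolve to if const_iterator were const double*:
// both candidates match exactly, and the non-template function wins the tie.
inline std::string call_with_pointers(const std::vector<double>& vec) {
    return foo(vec.data(), vec.data() + vec.size());
}

// With a class-type const_iterator (as in the mainstream libraries),
// the pointer overload is not viable and the template is chosen.
inline std::string call_with_iterators(const std::vector<double>& vec) {
    return foo(vec.begin(), vec.end());
}

inline void check_pointer_case() {
    std::vector<double> vec = {1.0, 2.0};
    assert(call_with_pointers(vec) == "pointer overload");
}
```

So the same source line would silently call a different function depending on how the library defines its iterator.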
Argument-dependent lookup (ADL; pointed out by juanchopanza): if you make an unqualified call, ADL ensures that functions in namespace std will be searched only if the arguments are types defined in namespace std. So,
std::vector<double> vec;
sort(vec.begin(), vec.end());               // calls std::sort
sort(vec.data(), vec.data() + vec.size());  // fails to compile
std::sort would not be found if vector<>::iterator were a mere pointer.
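The same effect can be reproduced without depending on any particular standard library. The sketch below uses a made-up namespace lib in place of std, with a hypothetical iterator type lib::iter and function lib::span_length.

```cpp
#include <cassert>

namespace lib {  // hypothetical namespace standing in for std

struct iter {    // class-type iterator wrapping a pointer
    const int* p;
};

// Reachable through ADL whenever an argument's type belongs to namespace lib.
inline int span_length(iter first, iter last) {
    return static_cast<int>(last.p - first.p);
}

}  // namespace lib

inline int length_via_adl() {
    static const int data[] = {1, 2, 3, 4};
    lib::iter first{data};
    lib::iter last{data + 4};
    // span_length(data, data + 4);   // would not compile: const int* has no associated namespace
    return span_length(first, last);  // unqualified call: ADL finds lib::span_length
}
```

Writing the call as lib::span_length(...) (or std::sort(...) in the original example) sidesteps the question entirely, which is why qualified calls are the portable choice.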
The implementation of the iterator is implementation-defined, so long as it fulfills the requirements of the standard. It could be a pointer for vector; that would work. There are several reasons for not using a pointer:
If all the iterators were pointers, then ++it on a map would not increment it to the next element, since the memory is not required to be contiguous. Beyond the contiguous memory of std::vector, most standard containers require "smarter" pointers, hence iterators.
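As a rough illustration of such a "smarter" pointer, here is a minimal forward iterator over a hand-made linked structure. The types node and node_iterator are invented for this sketch and are much simpler than what std::list or std::map really use.

```cpp
#include <cassert>

// Hypothetical singly linked node, standing in for the nodes of a node-based container.
struct node {
    int value;
    node* next;
};

// ++ follows the link instead of moving to the next address.
struct node_iterator {
    node* current;
    int& operator*() const { return current->value; }
    node_iterator& operator++() { current = current->next; return *this; }
    bool operator!=(const node_iterator& other) const { return current != other.current; }
};

inline int sum_nodes() {
    // The logical order (10, 20, 30) deliberately differs from the order in memory.
    node nodes[3] = {{30, nullptr}, {10, nullptr}, {20, nullptr}};
    nodes[1].next = &nodes[2];  // 10 -> 20
    nodes[2].next = &nodes[0];  // 20 -> 30
    int total = 0;
    int visited = 0;
    for (node_iterator it{&nodes[1]}, end{nullptr}; it != end; ++it) {
        total += *it;
        ++visited;
    }
    assert(visited == 3);
    return total;
}
```

Incrementing a raw node* here would step through memory order and run off the array, which is exactly why the increment has to be an overloaded operator.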
The physical requirements of the iterator dovetail very well with the logical requirement that movement between elements is a well-defined "idiom" of iterating over them, not just moving to the next memory location.
This was one of the original design requirements and goals of the STL: the orthogonal relationship between the containers and the algorithms, with the iterators connecting the two.
Now that they are classes, you can add a whole host of error checking and sanity checks to debug code (and then remove them for more optimised release code).
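A sketch of what such checking might look like, assuming a made-up checked_iterator that remembers its valid range. Real debug iterators (for example in MSVC or libstdc++ debug mode) are considerably more elaborate; this only shows the idea.

```cpp
#include <cassert>

// Hypothetical checked iterator. The asserts compile away when NDEBUG is defined;
// a real library would also drop the extra members in release builds.
template <typename T>
class checked_iterator {
public:
    checked_iterator(T* pos, T* first, T* last) : pos_(pos), first_(first), last_(last) {}

    T& operator*() const {
        assert(pos_ >= first_ && pos_ < last_ && "dereferencing an out-of-range iterator");
        return *pos_;
    }
    checked_iterator& operator++() {
        assert(pos_ < last_ && "incrementing past the end");
        ++pos_;
        return *this;
    }
    bool operator!=(const checked_iterator& other) const { return pos_ != other.pos_; }

private:
    T* pos_;
    T* first_;  // kept only for the checks
    T* last_;   // kept only for the checks
};

inline int sum_checked() {
    int data[] = {1, 2, 3};
    checked_iterator<int> it(data, data, data + 3);
    checked_iterator<int> end(data + 3, data, data + 3);
    int total = 0;
    for (; it != end; ++it) total += *it;
    return total;
}
```

A raw pointer offers no place to hang these checks, because its operators cannot be overloaded.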
Given the positive aspects class-based iterators bring, why should you, or should you not, just use pointers for std::vector iterators? Consistency. Early implementations of std::vector did indeed use plain pointers, and you can use them for vector. Once you have to use classes for the other iterators, given the positives they bring, applying that to vector becomes a good idea.
"The rationale for using a pointer is less overhead, higher performance, especially if an optimizing compiler detects iteration and does its thing (vector instructions and stuff). Using iterators might be harder for the compiler to optimize."
It might be, but it isn't. If your implementation is not utter shite, a struct wrapping a pointer will achieve the same speed.
With that in mind, it's simple to see that benefits like better diagnostic messages (naming the iterator instead of T*), better overload resolution, ADL, and debug checking make the struct a clear winner over the pointer. The raw pointer has no advantages.
"The rationale for using a pointer is less overhead, higher performance, especially if an optimizing compiler detects iteration and does its thing (vector instructions and stuff). Using iterators might be harder for the compiler to optimize."
This is the misunderstanding at the heart of the question. A well-formed class implementation will have no overhead and identical performance, because the compiler can optimize away the abstraction and treat the iterator class as just a pointer in the case of std::vector.
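A minimal sketch of why the abstraction can be free, using a hypothetical thin_iterator whose only member is a pointer. The static_asserts hold on the mainstream compilers; whether the two loops produce identical machine code is something to confirm with your own compiler and optimization flags.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical thin iterator: nothing but a pointer plus inline operators.
template <typename T>
struct thin_iterator {
    T* p;
    T& operator*() const { return *p; }
    thin_iterator& operator++() { ++p; return *this; }
    bool operator!=(thin_iterator other) const { return p != other.p; }
};

static_assert(sizeof(thin_iterator<double>) == sizeof(double*), "no extra storage");
static_assert(std::is_trivially_copyable<thin_iterator<double>>::value, "copied like a pointer");

inline double sum_with_pointer(const double* first, const double* last) {
    double total = 0.0;
    for (; first != last; ++first) total += *first;
    return total;
}

inline double sum_with_wrapper(thin_iterator<const double> first, thin_iterator<const double> last) {
    double total = 0.0;
    for (; first != last; ++first) total += *first;
    return total;
}

inline void check_same_result() {
    const double data[] = {1.0, 2.0, 3.0};
    assert(sum_with_pointer(data, data + 3) ==
           sum_with_wrapper(thin_iterator<const double>{data}, thin_iterator<const double>{data + 3}));
}
```

Since every operator is a one-line inline function on a single pointer, an optimizer has little left to remove once inlining has happened.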
That said, "MSVC++ 2013 or libstdc++ in Mingw 4.7, use a special class for vector iterators" because they take the view that adding a layer of abstraction, class iterator, to define the concept of iteration over a std::vector is more beneficial than using an ordinary pointer for this purpose.
Abstractions have a different set of costs vs benefits, typically added design complexity (not necessarily related to performance or overhead) in exchange for flexibility, future proofing, and hiding implementation details. The above compilers decided this added complexity is an appropriate cost to pay for the benefits of having an abstraction.
Because the STL was designed with the idea that you can write something that iterates over an iterator, no matter whether that iterator is just equivalent to a pointer to an element of a memory-contiguous array (like std::array or std::vector) or something like a linked list, a set of keys, something that gets generated on the fly on access, etc.
Also, don't be fooled: in the vector case, dereferencing might (without debug options) just break down to an inlinable pointer dereference, so there wouldn't even be overhead after compilation!
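As an example of the "generated on the fly" case, here is a sketch of a made-up counting_iterator that has no storage behind it at all, yet works with std::accumulate exactly like a pointer into an array does.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical iterator that produces its values on access.
struct counting_iterator {
    using iterator_category = std::input_iterator_tag;
    using value_type = int;
    using difference_type = std::ptrdiff_t;
    using pointer = const int*;
    using reference = int;

    int value;

    int operator*() const { return value; }
    counting_iterator& operator++() { ++value; return *this; }
    counting_iterator operator++(int) { counting_iterator old = *this; ++value; return old; }
    bool operator==(const counting_iterator& other) const { return value == other.value; }
    bool operator!=(const counting_iterator& other) const { return value != other.value; }
};

// Sums 1 through 10 without any container.
inline int sum_generated() {
    return std::accumulate(counting_iterator{1}, counting_iterator{11}, 0);
}

// The same algorithm, this time fed with plain pointers.
inline int sum_array() {
    const int data[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    int total = std::accumulate(data, data + 10, 0);
    assert(total == sum_generated());
    return total;
}
```

The algorithm is written once and does not care which of the two it receives.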
I think the reason is plain and simple: originally std::vector was not required to be implemented over contiguous blocks of memory. So the interface could not just present a pointer.
source: https://stackoverflow.com/a/849190/225186
This was fixed later, and std::vector was required to be in contiguous memory, but it was probably too late to make std::vector<T>::iterator a pointer. Maybe some code already depended on iterator being a class/struct.
Interestingly, I found implementations of std::vector<T>::iterator where the following is valid and generates a "null" iterator (just like a null pointer):
std::vector<double>::iterator it = {};
assert( &*it == nullptr );
Also, std::array<T>::iterator and std::initializer_list<T>::iterator are pointers T* in the implementations I saw.
A plain pointer as std::vector<T>::iterator would be perfectly fine in my opinion, in theory. In practice, being a built-in type has observable effects for metaprogramming (e.g. std::vector<T>::iterator::difference_type wouldn't be valid; yes, one should have used iterator_traits).
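For reference, a short sketch of the portable spelling: std::iterator_traits works for pointers through a partial specialization and for class-type iterators through their nested typedefs. The function name range_length is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Works whether It is a raw pointer or a class-type iterator.
template <typename It>
typename std::iterator_traits<It>::difference_type range_length(It first, It last) {
    return std::distance(first, last);
}

// Pointers have no nested difference_type, but the traits class still answers.
static_assert(std::is_same<std::iterator_traits<double*>::difference_type, std::ptrdiff_t>::value,
              "pointers are handled by a partial specialization");

inline void check_lengths() {
    std::vector<double> vec = {1.0, 2.0, 3.0};
    double raw[] = {1.0, 2.0};
    assert(range_length(vec.begin(), vec.end()) == 3);
    assert(range_length(raw, raw + 2) == 2);
}
```

Code written as It::difference_type compiles only for class-type iterators, which is the observable difference mentioned above.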
Not being a raw pointer has the (very) marginal advantage of disallowing nullability (it == nullptr) or default constructibility, if you are into that (an argument that doesn't matter from a generic programming point of view).
At the same time, the dedicated class iterators had a steep cost in other metaprogramming aspects, because if ::iterator were a pointer one wouldn't need ad hoc methods to detect contiguous memory (see contiguous_iterator_tag in https://en.cppreference.com/w/cpp/iterator/iterator_tags ) and generic code over vectors could be directly forwarded to legacy C functions. For this reason alone I would argue that iterator-not-being-a-pointer was a costly mistake. It just made it hard to interact with C code (as you need another layer of functions and type detection to safely forward stuff to C).
Having said this, I think we could still make things better by allowing automatic conversions from iterators to pointers and perhaps explicit (?) conversions from pointers to vector::iterators.
I got around this pesky obstacle by dereferencing and immediately referencing the iterator again. It looks ridiculous, but it satisfies MSVC...
class Thing {
    . . .
};

void handleThing(Thing* thing) {
    // do stuff
}

vector<Thing> vec;
// put some elements into vec now
for (auto it = vec.begin(); it != vec.end(); ++it)
    // handleThing(it);   // this doesn't work, would have been elegant ..
    handleThing(&*it);    // this DOES work
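The same trick extends to whole ranges. The sketch below forwards an iterator range to a C-style function; c_sum and sum_contiguous are hypothetical names, and the code simply assumes the range is contiguous rather than checking it. For a vector, vec.data() does the job directly, and C++20 adds std::to_address and the std::contiguous_iterator concept for doing this more safely.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical C-style function, standing in for a legacy API.
inline double c_sum(const double* data, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Assumption: [first, last) refers to contiguous storage (std::vector, std::array).
// Nothing here verifies that; std::list iterators would compile and misbehave.
template <typename It>
double sum_contiguous(It first, It last) {
    if (first == last) return 0.0;    // never dereference an empty range
    const double* data = &*first;     // dereference, then take the address again
    return c_sum(data, static_cast<std::size_t>(std::distance(first, last)));
}

inline void check_forwarding() {
    std::vector<double> vec = {1.5, 2.5};
    assert(sum_contiguous(vec.begin(), vec.end()) == 4.0);
}
```

The empty-range guard matters because &*it on an end iterator is not allowed, even though it happens to work on many implementations.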