
Can a size_type ever be larger than std::size_t?

Asked 2012-09-19 16:11:59. Tags: c++, stl, allocator, size-t

Standard containers with an std::allocator have their size_type defined as std::size_t. However, is it possible to have an allocator that allocates objects whose size cannot be represented by a size_t? In other words, can a size_type ever be larger than size_t?
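The premise can be checked at compile time. The following is a minimal sketch; the alias `DefaultAllocSize` and the helper `uses_size_t` are names invented for this illustration. The allocator check is what the standard specifies for std::allocator; the container check reflects what common implementations do, since the container's own size_type is formally implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// The size_type that std::allocator<int> exposes through allocator_traits.
using DefaultAllocSize = std::allocator_traits<std::allocator<int>>::size_type;

static_assert(std::is_same<DefaultAllocSize, std::size_t>::value,
              "std::allocator uses std::size_t as its size_type");

// Hypothetical helper: reports whether a container's size_type is std::size_t.
template <class Container>
constexpr bool uses_size_t() {
    return std::is_same<typename Container::size_type, std::size_t>::value;
}
```

Calling `uses_size_t<std::vector<int>>()` is expected to yield true on mainstream standard libraries.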

7 answers

Yes, and this could be useful in some cases.

Suppose you have a program that wishes to access more storage than will fit in virtual memory. By creating an allocator that references memory mapped storage and mapping it as required when indirecting pointer objects, you can access arbitrarily large amounts of memory.

This remains conformant to 18.2:6 because size_t is defined as large enough to contain the size of any object, but 17.6.3.5:2 table 28 defines size_type as containing the size of the largest object in the allocation model, which need not be an actual object in the C++ memory model.

Note that the requirements in 17.6.3.5:2 table 28 do not constitute a requirement that the allocation of multiple objects should result in an array; for allocate(n) the requirement is:

Memory is allocated for n objects of type T

and for deallocate the assertion is:

All n T objects in the area pointed to by p shall be destroyed prior to this call.

Note "area", not "array". Another point is 17.6.3.5:4:

The X::pointer, X::const_pointer, X::void_pointer, and X::const_void_pointer types shall satisfy the requirements of NullablePointer (17.6.3.3). No constructor, comparison operator, copy operation, move operation, or swap operation on these types shall exit via an exception. X::pointer and X::const_pointer shall also satisfy the requirements for a random access iterator (24.2).

There is no requirement here that (&*p) + n should be the same as p + n.
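To make the last point concrete, here is a minimal sketch of a pointer-like type for which p + n is meaningful but (&*p) + n is not. `ChunkedStore`, `ChunkedPtr` and the two demo helpers are hypothetical names invented for this illustration, and the type is far from a complete allocator pointer (no comparison, no null state, no iterator traits).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical storage: elements live in separately allocated chunks,
// so logically adjacent elements need not be adjacent in memory.
struct ChunkedStore {
    static constexpr std::size_t kChunk = 4;
    std::vector<std::unique_ptr<int[]>> chunks;

    int& at(std::size_t index) {
        const std::size_t chunk = index / kChunk;
        while (chunks.size() <= chunk) {
            // Value-initialized chunk: every element starts at 0.
            chunks.push_back(std::unique_ptr<int[]>(new int[kChunk]()));
        }
        return chunks[chunk][index % kChunk];
    }
};

// Hypothetical pointer-like type: arithmetic happens on the logical index.
struct ChunkedPtr {
    ChunkedStore* store;
    std::size_t index;

    int& operator*() const { return store->at(index); }
    ChunkedPtr operator+(std::size_t n) const { return ChunkedPtr{store, index + n}; }
};

// Writes through p + n and reads the value back through the store.
inline int demo_write_read(std::size_t n, int value) {
    ChunkedStore store;
    ChunkedPtr p{&store, 0};
    *(p + n) = value;  // may land in a different chunk than *p
    return store.at(n);
}

// Number of chunks that exist after touching element n.
inline std::size_t demo_chunks_touched(std::size_t n) {
    ChunkedStore store;
    ChunkedPtr p{&store, 0};
    *(p + n) = 1;
    return store.chunks.size();
}
```

With a chunk size of 4, element 5 lives in the second chunk, so stepping the raw address &*p forward by 5 would walk off the first chunk instead of reaching it.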

It's perfectly legitimate for a model expressible within another model to contain objects not representable in the outer model; for example, non-standard models in mathematical logic.

size_t is the unsigned integer type of the result of applying sizeof.

sizeof should return the size of the type (or of the type of the expression) that is its argument. In the case of arrays, it should return the size of the whole array.

This implies that:

there cannot be ANY structure or union that is larger than what size_t can represent.
there cannot be any array that is larger than what size_t can represent.

In other words, if something fits in the largest block of consecutive memory that you can access, then its size must fit in size_t. (In non-portable but intuitive terms, this means that on most systems size_t is as large as void* and can 'measure' the whole of your virtual address space.)
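The relationship between sizeof and size_t can be checked directly. This is a small sketch; `Block` and `size_t_matches_pointer_width` are names invented for the illustration, and the pointer-width comparison is the non-portable intuition mentioned above, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// sizeof always yields a value of type std::size_t.
static_assert(std::is_same<decltype(sizeof(int)), std::size_t>::value,
              "sizeof yields std::size_t");

// For an array, sizeof covers the whole array.
static_assert(sizeof(char[1024]) == 1024, "sizeof(char) is 1 by definition");

struct Block {
    char bytes[1024];
};
static_assert(sizeof(Block[4]) == 4 * sizeof(Block),
              "an array of N objects occupies N times the element size");

// Non-portable intuition only: true on most flat-address-space systems.
constexpr bool size_t_matches_pointer_width() {
    return sizeof(std::size_t) == sizeof(void*);
}
```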

Edit: this next sentence is probably wrong. See below.

Therefore the answer to "is it possible to have an allocator that allocates objects whose size cannot be represented by a size_t?" is no.

Edit (addendum):

I've been thinking about it and the above may in fact be wrong. I've checked the standard and it seems to be possible to design a completely custom allocator with completely custom pointer types, including using different types for pointer, const pointer, void pointer and const void pointer. Therefore an allocator can in fact have a size_type that is larger than size_t.

But to do so you need to actually define completely custom pointer types and the corresponding allocator and allocator traits instances.

The reason I say "may" is that I'm still a bit unclear whether the size_type needs to span the size of a single object or also the size of multiple objects (that is, an array) in the allocator model. I will need to investigate this detail (but not now, it's dinner time here :) )

Edit2 (new addendum):

@larsmans I think you may want to decide what to accept anyway. The problem seems to be a little more complicated than one may intuitively realize. I'm editing the answer again as my thoughts are definitely more than a comment (both in content and in size).

ReEdit (as pointed out in the comments, the next two paragraphs are not correct):

First of all, size_type is just a name. You can of course define a container and add a size_type to it with whatever meaning you wish. Your size_type could be a float, a string, whatever.

That said, in standard library containers size_type is defined in the container only to make it easy to access. It's in fact supposed to be identical to the size_type of the allocator for that container (and the size_type of the allocator should be the size_type of the allocator_traits of that allocator).

Therefore we shall henceforth assume that the size_type of the container, even one you define, follows the same logic 'by convention'. @BenVoight begins his answer with "As @AnalogFile explains, no allocated memory can be larger than size_t. So a container which inherits its size_type from an allocator cannot have size_type larger than size_t.". In fact we are now stipulating that if a container has a size_type then it comes from the allocator (he says "inherits", but that of course is not meant in the usual sense of class inheritance).

However he may or may not be 100% right that a size_type (even if it comes from an allocator) is necessarily constrained to size_t. The question really is: can an allocator (and the corresponding traits) define a size_type that is larger than size_t?

Both @BenVoight and @ecatmur suggest a use case where the backing store is a file. However, if the backing store is a file only for the content and you have something in memory that refers to that content (let's call that a 'handle'), then you are in fact building a container that contains handles. A handle will be an instance of some class that stores the actual data in a file and only keeps in memory whatever it needs to retrieve that data, but this is irrelevant to the container: the container will store the handles, those are in memory, and we are still in the 'normal' address space, so my initial response is still valid.

There is another case, however. You are not allocating handles; you are actually storing stuff in the file (or database), and your allocator (and related traits) define pointer, const pointer, void pointer, const void pointer etc. types that directly manage that backing store. In this case, of course, they also need to define the size_type (replacing size_t) and difference_type (replacing ptrdiff_t) to match.
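The mechanics of an allocator supplying its own size_type and difference_type can be sketched as follows. `WideSizeAllocator`, `WideTraits` and `allocate_round_trip` are hypothetical names, and the sketch still hands out ordinary T* pointers into ordinary memory, so it shows only how allocator_traits picks up the nested types, not a real out-of-memory backing store. Note also that unsigned long long is wider than size_t only on platforms where size_t is 32 bits; on typical 64-bit platforms the two have the same width.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical allocator that declares its own size_type and difference_type.
template <class T>
struct WideSizeAllocator {
    using value_type = T;
    using size_type = unsigned long long;  // assumption: stands in for a "wider" type
    using difference_type = long long;

    WideSizeAllocator() = default;
    template <class U>
    WideSizeAllocator(const WideSizeAllocator<U>&) noexcept {}

    T* allocate(size_type n) {
        // Ordinary memory can still only hold what size_t can count.
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
            throw std::bad_alloc();
        }
        return static_cast<T*>(::operator new(static_cast<std::size_t>(n) * sizeof(T)));
    }

    void deallocate(T* p, size_type) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const WideSizeAllocator<T>&, const WideSizeAllocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const WideSizeAllocator<T>&, const WideSizeAllocator<U>&) noexcept {
    return false;
}

// allocator_traits forwards the nested types declared by the allocator.
using WideTraits = std::allocator_traits<WideSizeAllocator<int>>;
static_assert(std::is_same<WideTraits::size_type, unsigned long long>::value,
              "traits pick up the allocator's size_type");
static_assert(std::is_same<WideTraits::difference_type, long long>::value,
              "traits pick up the allocator's difference_type");

// Allocates n ints, constructs the first one, then cleans up.
inline bool allocate_round_trip(WideTraits::size_type n) {
    WideSizeAllocator<int> alloc;
    int* p = WideTraits::allocate(alloc, n);
    WideTraits::construct(alloc, p, 7);
    const bool ok = (*p == 7);
    WideTraits::destroy(alloc, p);
    WideTraits::deallocate(alloc, p, n);
    return ok;
}
```

Whether a given standard container then adopts this size_type is a separate matter that depends on the library implementation.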

The direct difficulties in defining size_type (and difference_type) as larger than size_t, when size_t is already as large as the largest implementation-provided primitive integral type (if it is not, there are no difficulties), are related to the fact that they need to be integer types.

Depending on how you interpret the standard this may be impossible (because according to the standard, integer types are the types defined in the standard plus the extended integer types provided by the implementation) or possible (if you interpret it such that you can provide an extended integer type yourself), as long as you can write a class that behaves exactly like a primitive type. This was impossible in the old times (overloading rules did make primitive types always distinguishable from user-defined types), but I'm not 100% up-to-date with C++11 and this may (or may not) have changed.
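One way to see the difficulty is to ask the standard type traits about a class that merely imitates a wider integer. In this minimal sketch, `Wide128` and `counts_as_integer` are hypothetical names; the class has no arithmetic at all, since the point is only how the library classifies it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical 128-bit counter built from two 64-bit halves.
struct Wide128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

// The library does not treat a user-defined class as an integral type.
static_assert(!std::is_integral<Wide128>::value,
              "a class type is not an integral type");
static_assert(!std::numeric_limits<Wide128>::is_specialized,
              "numeric_limits has no specialization for it by default");
static_assert(std::is_integral<std::size_t>::value,
              "size_t is an integral type");

// Hypothetical helper mirroring the question "is this an integer type?".
template <class T>
constexpr bool counts_as_integer() {
    return std::is_integral<T>::value;
}
```

A program can specialize std::numeric_limits for its own type, but it is not allowed to specialize std::is_integral, so a class can at best imitate an integer type without being classified as one.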

However there are also indirect difficulties. You not only need to provide a suitable integer type for size_type. You also need to provide the rest of the allocator interface.

I've been thinking about it a little and one problem I see is in implementing *p according to 17.6.3.5. In that *p syntax, p is a pointer as typed by the allocator traits. Of course we can write a class and define an operator* (the nullary method version, doing pointer dereference). And one may think that this can be easily done by 'paging in' the relevant part of the file (as @ecatmur suggests). However there's a problem: *p must be a T& for that object. Therefore the object itself must fit in memory and, more importantly, since you may do T &ref = *p and hold that reference indefinitely, once you have paged in the data you will never be allowed to page it out again. This means that effectively there may be no way to properly implement such an allocator unless the whole backing store can also be loaded into memory.
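The pinning problem can be sketched with a toy model. `PagedStore`, `PagedPtr` and `resident_after_touching` are hypothetical names, and a std::map stands in for the paging layer; no real file or memory mapping is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical backing store: offsets are 64-bit, and an element gets an
// in-memory copy the first time it is dereferenced.
struct PagedStore {
    std::map<std::uint64_t, int> resident;  // offset -> paged-in copy

    int& page_in(std::uint64_t offset) { return resident[offset]; }
    std::size_t resident_count() const { return resident.size(); }
};

// Hypothetical pointer-like type addressing the store by offset.
struct PagedPtr {
    PagedStore* store;
    std::uint64_t offset;

    // Must return a real int&, so the element has to be in memory.
    int& operator*() const { return store->page_in(offset); }
};

// Dereferences `count` distinct offsets and keeps every reference alive.
inline std::size_t resident_after_touching(std::uint64_t count) {
    PagedStore store;
    std::vector<int*> held;  // stands in for references the caller never lets go of
    for (std::uint64_t offset = 0; offset < count; ++offset) {
        int& ref = *PagedPtr{&store, offset};
        held.push_back(&ref);
    }
    // The store cannot know which references are still in use,
    // so it cannot safely evict any of them.
    for (int* r : held) {
        *r = 1;  // every held reference is still expected to work
    }
    return store.resident_count();
}
```

Every element that was ever dereferenced stays resident, which is exactly the growth described above.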

Those are my early observations, and they seem to confirm my first impression that the real answer is no: there is no practical way to do it.

However, as you see, things are much more complicated than mere intuition seems to suggest. It may take quite some time to find a definitive answer (and I may or may not go ahead and research the topic further).

For the moment I'll just say: it seems not to be possible. Statements to the contrary shall only be acceptable if they are not based solely on intuition: post code and let people debate whether your code fully conforms to 17.6.3.5 and whether your size_type (which shall be larger than size_t even if size_t is as large as the largest primitive integer type) can be considered an integer type.

Yes and no.

As @AnalogFile explains, no allocated memory can be larger than size_t. So a container which inherits its size_type from an allocator cannot have size_type larger than size_t.

However, you can design a container type which represents a collection not entirely stored in addressable memory. For example, the members could be on disk or in a database. They could even be computed dynamically, e.g. a Fibonacci sequence, and never stored anywhere at all. In such cases, size_type could easily be larger than size_t.
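A computed collection of this kind might look like the sketch below. `FibonacciSequence` is a hypothetical class, elements are Fibonacci numbers reduced modulo 2^64 (an assumption made so that every index has a representable value), and the fast-doubling recurrence is just one convenient way to compute an element without storing any. std::uintmax_t is used for size_type; it is at least as wide as size_t and strictly wider only on some platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <utility>

// Hypothetical read-only "container" whose elements are computed on demand.
class FibonacciSequence {
public:
    using value_type = std::uint64_t;
    using size_type = std::uintmax_t;  // not tied to any allocator

    // Every index of size_type is valid; nothing is stored.
    size_type size() const { return std::numeric_limits<size_type>::max(); }

    // F(n) modulo 2^64, computed in O(log n) steps.
    value_type operator[](size_type n) const { return pair_at(n).first; }

private:
    // Returns (F(n), F(n+1)) modulo 2^64 using the fast-doubling identities
    // F(2k) = F(k) * (2*F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2.
    static std::pair<value_type, value_type> pair_at(size_type n) {
        if (n == 0) {
            return std::make_pair(value_type{0}, value_type{1});
        }
        const std::pair<value_type, value_type> half = pair_at(n / 2);
        const value_type a = half.first;
        const value_type b = half.second;
        const value_type c = a * (2 * b - a);  // F(2k), unsigned wraparound is intended
        const value_type d = a * a + b * b;    // F(2k+1)
        if (n % 2 == 0) {
            return std::make_pair(c, d);
        }
        return std::make_pair(d, c + d);
    }
};
```

For example, `FibonacciSequence{}[10]` evaluates to 55 without any element ever being allocated.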

I'm sure it's buried in the standard somewhere, but the best description I've seen for size_type is from the SGI-STL documentation. As I said, I'm sure it is in the standard, and if someone can point it out, by all means do.

According to SGI, a container's size_type is:

An unsigned integral type that can represent any nonnegative value of the container's distance type

It makes no claim that it must be anything besides that. In theory you could define a container that uses uint64_t, unsigned char, or anything else in between. That it is referencing the container's distance_type is the part I find interesting, since...

distance_type: A signed integral type used to represent the distance between two of the container's iterators. This type must be the same as the iterator's distance type.
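The SGI wording can be turned into a compile-time check. In this sketch the helper `size_type_covers_distance` is a name invented for the illustration, and the standard containers' difference_type is used as the counterpart of SGI's distance type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper: checks that size_type is unsigned, difference_type is
// signed, and every nonnegative difference_type value fits in size_type.
template <class Container>
constexpr bool size_type_covers_distance() {
    using S = typename Container::size_type;
    using D = typename Container::difference_type;
    return std::is_unsigned<S>::value && std::is_signed<D>::value &&
           static_cast<std::uintmax_t>(std::numeric_limits<D>::max()) <=
               static_cast<std::uintmax_t>(std::numeric_limits<S>::max());
}

static_assert(size_type_covers_distance<std::vector<int>>(),
              "vector's size_type covers its difference_type");
```

Nothing in this check mentions size_t, which is the point being made: the requirement relates size_type to the distance type only.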

This doesn't really answer the question, but it is interesting to see how size_type and size_t differ (or can differ). Regarding your question, see (and upvote) @AnalogFile's answer, as I believe it to be correct.

From §18.2/6

The type size_t is an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object.

So, if it were possible for you to allocate an object whose size cannot be represented by a size_t, it would make the implementation non-conforming.
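For the default allocator, the same bound shows up in max_size. This is a small sketch, with `max_size_fits_in_size_t` as a name invented for the illustration; it reflects how common standard library implementations behave and is not a quotation of a requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <memory>

// Hypothetical helper: the largest element count std::allocator<T> reports
// never describes more bytes than std::size_t can represent.
template <class T>
bool max_size_fits_in_size_t() {
    std::allocator<T> alloc;
    const std::size_t n = std::allocator_traits<std::allocator<T>>::max_size(alloc);
    return n <= std::numeric_limits<std::size_t>::max() / sizeof(T);
}
```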

To add to the "standard" answers, also note the stxxl project, which is supposed to be able to handle terabytes of data using disk storage (and perhaps, by extension, network storage). See the header of vector, for example, for the definition of size_type (line 731 and line 742) as uint64.

This is a concrete example of containers whose sizes exceed what memory can hold, or even what the system's native integer can represent.

Not necessarily.

I assume by size_type you mean the typedef inside most STL containers?

If so, then the very fact that size_type was added to all the containers, instead of just using size_t, means that the STL reserves the right to make size_type any type it likes. (By default, in all implementations I'm aware of, size_type is a typedef of size_t.)