Calculating size of vector of vectors in bytes

typedef vector<vector<short>> Mshort;
typedef vector<vector<int>> Mint;

Mshort mshort(1 << 20, vector<short>(20, -1)); // Xcode shows 73MB 
Mint mint(1 << 20, vector<int>(20, -1)); // Xcode shows 105MB

short uses 2 bytes and int 4 bytes; please note that 1 << 20 = 2^20.

I am trying to calculate memory usage ahead of time (on paper), but I am unable to.

sizeof(vector<>) // = 24 //no matter what type
sizeof(int) // = 4
sizeof(short) // = 2

I do not understand: mint should be double the size of mshort, but it isn't. When running the program with only the mshort initialisation, Xcode shows 73MB of memory usage; for mint, 105MB.

mshort.size() * mshort[0].size() * sizeof(short) * sizeof(vector<short>) // = 1006632960
mint.size() * mint[0].size() * sizeof(int) * sizeof(vector<int>) // = 2013265920

//no need to use .capacity() because I fill vectors with -1
1006632960 * 2 = 2013265920

How does one calculate how much RAM a 2D std::vector or a 2D std::array will use?

I know the sizes ahead of time, and each row has the same number of columns.

The memory usage of your vector of vectors will be e.g.:

// the size of the data...
mshort.size() * mshort[0].size() * sizeof(short) +

// the size of the inner vector objects...
mshort.size() * sizeof mshort[0] +

// the size of the outer vector object...
// (this is ostensibly on the stack, given your code)
sizeof mshort +

// dynamic allocation overheads
overheads
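As a rough sketch of the sum above, the helper below adds up the three sizeof-based terms and lets you plug in a guessed per-allocation overhead. The names estimate_bytes and per_alloc_overhead are illustrative only (nothing in the standard library provides them), and the real allocator overhead depends on the platform. It uses capacity() rather than size(), since capacity is what was actually allocated.

```cpp
#include <cstddef>
#include <vector>

// Estimate the memory used by a vector of vectors.
// per_alloc_overhead is an assumed allocator cost per heap block (0 = ignore it).
template <typename T>
std::size_t estimate_bytes(const std::vector<std::vector<T>>& m,
                           std::size_t per_alloc_overhead = 0)
{
    std::size_t total = sizeof m;                        // the outer vector object
    std::size_t allocations = m.capacity() ? 1 : 0;      // the outer vector's heap buffer
    total += m.capacity() * sizeof(std::vector<T>);      // the inner vector objects
    for (const auto& row : m) {
        total += row.capacity() * sizeof(T);             // the element data of this row
        if (row.capacity()) ++allocations;               // one heap block per non-empty row
    }
    return total + allocations * per_alloc_overhead;     // plus the guessed overheads
}
```

For example, estimate_bytes(mshort, 16) would evaluate the formula with an assumed 16 bytes of overhead per allocation.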

The dynamic allocation overheads arise because the vectors internally new memory for the elements they're to store, and for speed reasons the allocator may keep pools of fixed-size memory areas waiting for new requests, so if the vector effectively does a new short[20] - with the data needing 40 bytes - it might end up using e.g. 48 or 64. The implementation may also need some extra memory to store the array size, though for short and int there's no need to loop over the elements invoking destructors during delete[], so a good implementation will avoid that allocation and the no-op destruction behaviour.

The actual data elements for any given vector are contiguous in memory though, so if you want to reduce the overheads, you can change your code to use fewer, larger vectors. For example, using one vector with (1 << 20) * 20 elements will have negligible overhead - then rather than accessing [i][j] you can access [i * 20 + j] - you can write a simple class wrapping the vector to do this for you, most simply with a v(i, j) notation...

inline short& operator()(size_t i, size_t j) { return v_[i * 20 + j]; }
inline short operator()(size_t i, size_t j) const { return v_[i * 20 + j]; }

...though you could support v[i][j] by having v.operator[] return a proxy object that can be further indexed with []. I'm sure if you search SO for questions on multi-dimensional arrays there'll be some examples - I think I may have posted such code myself once.
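A minimal sketch of such a wrapper might look like the following. The class name FlatMatrix and its members are made up for illustration; here the "proxy" returned by operator[] is simply a pointer to the start of the row, which is the cheapest thing that can be indexed again with [].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: a rows x cols matrix of short kept in ONE vector,
// so there is a single heap allocation instead of one per row.
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols, short init = 0)
        : rows_(rows), cols_(cols), v_(rows * cols, init) {}

    // v(i, j) notation, row-major layout.
    short& operator()(std::size_t i, std::size_t j) {
        assert(i < rows_ && j < cols_);
        return v_[i * cols_ + j];
    }
    short operator()(std::size_t i, std::size_t j) const {
        assert(i < rows_ && j < cols_);
        return v_[i * cols_ + j];
    }

    // v[i][j] notation: return a pointer to row i, which can be indexed with [j].
    short* operator[](std::size_t i) {
        assert(i < rows_);
        return v_.data() + i * cols_;
    }
    const short* operator[](std::size_t i) const {
        assert(i < rows_);
        return v_.data() + i * cols_;
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<short> v_;
};
```

With this, FlatMatrix m(1 << 20, 20, -1); would replace the mshort declaration, and m(i, j) or m[i][j] would replace mshort[i][j]. A pointer gives no bounds check on the column index; a small proxy class would be needed for that.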

The main reason to want vector<vector<x>> is when the inner vectors vary in length.

Assuming glibc malloc: each memory chunk allocates an additional 8-16 bytes (2 size_t) for the memory block header. On a 64-bit system it would be 16 bytes. See the code: https://github.com/sploitfun/lsploits/blob/master/glibc/malloc/malloc.c#L1110

chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             Size of previous chunk, if allocated            | |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             Size of chunk, in bytes                       |M|P|
  mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             User data starts here...                          .
        .                                                               .
        .             (malloc_usable_size() bytes)                      .
        .                                                               |

That gives me approximately 83886080 for short when adding 16 bytes per row:

24 + 16 + mshort.size() (1048576) * (mshort[0].size() (20) * sizeof(short) (2) + sizeof(vector) (24) + header (16))

It gives me approximately 125829120 for int.

But then I recomputed your numbers and it looks like you are on 32 bit...

  • short: 75497472, that is ~73M
  • int: 117440512, that is ~112M

These look very close to the reported ones.
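The arithmetic behind these figures can be written down as a small compile-time sketch. The function name estimate and its parameters are illustrative; the 24-byte vector object comes from the question, while the 16-byte and 8-byte headers are the assumptions made in this answer, not measured values. The one-off cost of the outer vector is left out, which is why the results are only approximate.

```cpp
#include <cstdint>

// Per-row cost = element data + inner vector object + allocator header.
// vec_size and header are assumed inputs, not values queried from the platform.
constexpr std::uint64_t estimate(std::uint64_t rows, std::uint64_t cols,
                                 std::uint64_t elem_size,
                                 std::uint64_t vec_size,
                                 std::uint64_t header)
{
    return rows * (cols * elem_size + vec_size + header);
}

// 16-byte header per row:
static_assert(estimate(1u << 20, 20, 2, 24, 16) == 83886080, "short, 16-byte header");
static_assert(estimate(1u << 20, 20, 4, 24, 16) == 125829120, "int, 16-byte header");

// 8-byte header per row:
static_assert(estimate(1u << 20, 20, 2, 24, 8) == 75497472, "short, 8-byte header");
static_assert(estimate(1u << 20, 20, 4, 24, 8) == 117440512, "int, 8-byte header");
```

Because the header and the vector object are fixed costs per row, doubling the element size adds only 40 bytes to each row rather than doubling the total.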

Use capacity(), not size(), to get the number of items, even if they are the same in your case.

Allocating a single vector of size rows * columns will save you header * 1048576 bytes.

Your calculation mshort.size() * mshort[0].size() * sizeof(short) * sizeof(vector<short>) // = 1006632960 is simply wrong. By your calculation, mshort takes 1006632960 bytes, which is 960MiB, and that is not true.

Let's ignore libc's overhead and just focus on the size of std::vector<>: mshort is a vector of 2^20 items, each of which is a vector<short> with 20 items. So the size shall be:

  mshort.size() * mshort[0].size() * sizeof(short)   // Size of all short values
+ mshort.size() * sizeof(vector<short>)              // Size of the 2^20 vector<short> objects
+ sizeof(mshort)                                     // Size of mshort itself, which can be ignored as overhead

The calculated size is 64MiB.

The same goes for mint, where the calculated size is 104MiB.

So mint is simply NOT double the size of mshort.
