
Size limit of arrays and vectors in C++

I want to store pow(10,12) integers in an array. I know an integer array can only store up to pow(10,7) integers. So what should I do now? Also, can a vector be used for this purpose? If so, what is the size limit of a vector? Apart from arrays and vectors, is there any other way to accomplish this?

Additional detail: compiler = TDM-GCC 4.9.2 64-bit Release.

The most obvious limitation is not the language, it is the computer.

10^12 integers (of type int, assuming the usual 4 bytes each) would take 4 terabytes of memory. Do you have access to an expensive supercomputer with that much RAM? Such machines typically cost millions of dollars or euros.

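If you do, you can probably use a std::vector<int> (whose elements live on the heap) or a heap-allocated std::array<int,1000ULL*1000*1000*1000> (at least on some x86-64 Linux supercomputer system). The ULL suffix matters: written with plain int literals, the product overflows int.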

But you probably don't: your computer has much less than 4 terabytes (4096 gigabytes) of RAM; if it is a desktop, it might have a few dozen gigabytes at most. TDM-GCC is for Windows, and all supercomputers today run some variant of Linux.

BTW, while heap memory lives in virtual memory, in practice you'll experience thrashing if you allocate (as suggested by sehe's answer) a terabyte of data on a computer with only gigabytes of RAM. See mmap(2) and madvise(2) on Linux.

Perhaps you could access that array in chunks. Then consider storing the chunks in a database (maybe PostgreSQL or SQLite) or in a big binary file. You'll need a lot of disk space to fit the 4 TB requirement.

BTW, if you handle that much data, I strongly recommend learning and using Linux on your machine and coding for Linux, since all supercomputers and cloud clusters are Linux-based. You could prototype your software on your Linux laptop or desktop with a small amount of data (e.g. 8 GB), then port it to some cloud or supercomputer (which will cost you big bucks).

I'm not saying you should do this. But you could leverage your system's virtual memory manager (VMM); all OSes have one.

Here's a simple sample that allocates the file (3.7 TiB); it doesn't actually write the blocks unless your filesystem lacks support for sparse files.

It then proceeds to write 5 random values at random indices in your array.

On most systems, this will end up writing at most five 4 KiB blocks to disk, even though the file is nominally 3.7 TiB. The operating system deals with swapping the pages in and out on demand and with writing changes back to disk.

Live On Coliru¹

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <iostream>
#include <random>     // for random writes only
#include <functional> // for random writes only

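int main() {
    // 10^12 ints; at 4 bytes each the mapping spans about 3.7 TiB.
    const size_t N = 1000000000000ull;

    int fd = open("large.db", O_RDWR|O_CREAT, 0777);

    if (fd==-1) {
        perror("opening");
        return 1;
    }

    // fallocate64(fd, mode, offset, len): reserving a single byte at the far
    // end grows the file to sizeof(int)*N + 1 bytes. On filesystems with
    // sparse-file support, the blocks before that byte stay unallocated.
    if (-1==fallocate64(fd, 0, sizeof(int)*N, 1)) {
        perror("fallocate");
        close(fd);
        return 1;
    }

    void* mapping = mmap64(nullptr, sizeof(int)*N, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

    if (mapping!=MAP_FAILED) {
        int* data = static_cast<int*>(mapping);
        auto randindex = std::bind(std::uniform_int_distribution<size_t>(0, N-1), std::mt19937{ std::random_device{} () });

        // Only the pages touched here get backed by real disk blocks.
        for(int i=0; i<5; ++i)
            data[randindex()] = rand();

        // Unmap only a mapping that actually succeeded.
        if (munmap(mapping, sizeof(int)*N))
            perror("munmap");
    } else {
        perror("mmap");
    }

    close(fd);
}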

Inspecting the resulting large.db with e.g. od large.db on Linux will show that the changed data was actually persisted to disk.

¹ Coliru limits don't allow this (obviously)

"I want to store pow(10,12) integers in an array. I know an integer array can only store up to pow(10,7) integers. So what should I do now?"

Create a wrapper class that contains an array of arrays, each of which holds MAX_LENGTH elements.

When the user wants to access position (MAX_LENGTH + 10) or position (2 * MAX_LENGTH + 11), your access method simply does the math to figure out which inner array covers that range of indices, then does the get/set on it.

If not every element in this SuperArray(tm) will be populated, then perhaps use a sparse vector/array?

Caveat: your system might not have enough RAM plus swap to support it, in which case you will end up with some fun, exotic solution that shuttles data between RAM and files on the filesystem, or database tables, or something else.
