
Fastest way to read a vector<double> from file

I have three vectors, each with exactly 256^3 (about 16.8 million) elements, that I want to store in a file and read back as fast as possible. I only care about read performance, and the in-memory representation of the data can be anything.

I have looked at some serialization techniques, as well as writing and reading plain numbers to and from a file with ofstream, but I wonder if there is a more direct and faster approach.

(I am pretty new to C++ and its concepts.)

Assuming both systems, Windows and Android, are little-endian, which is the norm on ARM and x86/x64 CPUs, you can do the following.

First: Pick an element type with a specific, fixed size: double (64-bit), float (32-bit), or one of uint64_t/uint32_t/uint16_t or int64_t/int32_t/int16_t. Do NOT use types like int or long, whose size can differ between platforms.
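The size assumption can be made explicit so that the build fails on a platform where it does not hold. This is only a minimal sketch, assuming a C++11 compiler; the alias name `Element` is hypothetical and not part of the answer above.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical alias for the element type stored in the file.
using Element = double;

// Refuse to compile if the type does not have the expected width.
static_assert(CHAR_BIT == 8, "bytes must be 8 bits wide");
static_assert(sizeof(Element) == 8, "Element must be exactly 64 bits wide");
static_assert(sizeof(std::uint64_t) == 8, "uint64_t must be exactly 64 bits wide");
```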

Second: Use the following method to write binary data:

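// Requires <cstdint>, <fstream> and <vector>.
std::vector<uint64_t> myVec;  // filled with the data to store
std::ofstream f("outputFile.bin", std::ios::binary);
// Write the vector's contiguous buffer as raw bytes in a single call.
f.write(reinterpret_cast<const char*>(myVec.data()), myVec.size()*sizeof(uint64_t));
f.close();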

Here you take the vector's raw data and write its binary representation directly to a file.

Now, on the other machine, make sure the data type you use has the same size and the same endianness. If both match, you can do this:
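For the vector<double> case from the question, the same write can be wrapped in a small function that also reports failures. This is a sketch only: `writeVector` is a hypothetical helper name, and flushing before checking the stream state is one possible way to surface write errors.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: dumps the raw bytes of `values` into the file at `path`.
// Returns true only if the file could be opened and the write succeeded.
bool writeVector(const std::string& path, const std::vector<double>& values) {
    assert(!path.empty() && "an output path is required");
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;  // the file could not be opened
    }
    // One write call for the whole contiguous buffer.
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(double)));
    out.flush();  // push buffered bytes out so that errors show up here
    return out.good();
}
```

With 256^3 doubles per vector, each file written this way is 128 MiB, so checking the return value is worthwhile in case the disk fills up.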

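// Requires <cstdint>, <fstream> and <vector>.
std::vector<uint64_t> myVec(sizeOfTheData);  // element count must be known up front
std::ifstream f("outputFile.bin", std::ios::binary);
// Read the raw bytes straight into the vector's buffer.
f.read(reinterpret_cast<char*>(myVec.data()), myVec.size()*sizeof(uint64_t));
f.close();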

Notice that you have to know the size of the data before reading it.
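If the file contains nothing but the raw elements, the element count can usually be derived from the file size instead of being passed around separately. The sketch below assumes exactly that layout (no header, only doubles). `readVector` is a hypothetical helper, and using the end-of-file position as the size works in practice for binary streams on mainstream platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a file of raw doubles into `values`.
// The element count is derived from the file size.
bool readVector(const std::string& path, std::vector<double>& values) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);  // open at the end
    if (!in) {
        return false;  // the file could not be opened
    }
    const std::streamoff bytes = in.tellg();  // position at the end == size in bytes
    if (bytes < 0 || bytes % static_cast<std::streamoff>(sizeof(double)) != 0) {
        return false;  // size unknown or not a whole number of doubles
    }
    values.resize(static_cast<std::size_t>(bytes) / sizeof(double));
    assert(values.size() * sizeof(double) == static_cast<std::size_t>(bytes));
    in.seekg(0, std::ios::beg);
    // One read call straight into the vector's buffer.
    in.read(reinterpret_cast<char*>(values.data()), static_cast<std::streamsize>(bytes));
    return in.gcount() == static_cast<std::streamsize>(bytes);
}
```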

Note: This code is off the top of my head. I haven't tested it, but it should work.

Now, if the target system doesn't have the same endianness, you have to read the data in batches, flip the endianness of each element, then put the result in your vector. How to flip endianness has been discussed extensively elsewhere.
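As a rough illustration of the flip itself, the sketch below reverses the byte order of every 64-bit element. It is a simplified variant that swaps the whole vector in place after it has been read, rather than working in batches. The function names are hypothetical, and it assumes double is 64 bits wide.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Reverses the byte order of one 64-bit value.
std::uint64_t swapBytes64(std::uint64_t v) {
    v = ((v & 0x00000000FFFFFFFFULL) << 32) | ((v & 0xFFFFFFFF00000000ULL) >> 32);
    v = ((v & 0x0000FFFF0000FFFFULL) << 16) | ((v & 0xFFFF0000FFFF0000ULL) >> 16);
    v = ((v & 0x00FF00FF00FF00FFULL) << 8) | ((v & 0xFF00FF00FF00FF00ULL) >> 8);
    return v;
}

// Hypothetical helper: flips the byte order of every double in place.
void swapEndianness(std::vector<double>& values) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "double must be 64 bits");
    for (double& d : values) {
        std::uint64_t bits;
        std::memcpy(&bits, &d, sizeof bits);  // copy the bits out without aliasing issues
        bits = swapBytes64(bits);
        std::memcpy(&d, &bits, sizeof bits);  // store the swapped bits back
    }
}
```

Many compilers also provide a byte-swap intrinsic that could replace `swapBytes64`; the portable shift-and-mask version is used here to avoid depending on one.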

How to determine the endianness of your system has also been discussed elsewhere.
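One common runtime check looks at which byte of a known integer is stored first. This is only a sketch, and the function name `isLittleEndian` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true if the least significant byte of an integer is stored first.
bool isLittleEndian() {
    const std::uint32_t probe = 1;
    unsigned char firstByte = 0;
    std::memcpy(&firstByte, &probe, 1);  // inspect the byte at the lowest address
    return firstByte == 1;
}
```

The writer could store the result in the file, for example as a flag, so that the reader knows whether a byte swap is needed; that file layout would be a design choice of its own and is not something the answer prescribes.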

The performance penalty depends on how different the two systems are. If both have the same endianness and you use the same data type and size, no conversion is needed and a single bulk read is about as fast as stream I/O gets. Otherwise, you pay a penalty that grows with the number of conversions you have to do.

Note from comments: If you're transferring doubles or floats, make sure both systems use the IEEE 754 standard. IEEE 754 is even more widespread than little-endian byte order, but it is worth checking just to be sure.

Now, if these solutions don't fit your needs, you have to use a proper serialization library that standardizes the format for you. There are libraries that can do that, such as protobuf.
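The IEEE 754 assumption can be checked at compile time on each platform through `std::numeric_limits`. This is a minimal sketch, assuming a C++11 compiler.

```cpp
#include <limits>

// is_iec559 is true when the type follows the IEEE 754 (IEC 60559) format.
static_assert(std::numeric_limits<double>::is_iec559, "double is not IEEE 754");
static_assert(std::numeric_limits<float>::is_iec559, "float is not IEEE 754");
static_assert(sizeof(double) == 8 && sizeof(float) == 4, "unexpected floating-point widths");
```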

