简体   繁体   中英

Reading and writing C++ vector to a file

For some graphics work I need to read in a large amount of data as quickly as possible and would ideally like to directly read and write the data structures to disk. Basically I have a load of 3d models in various file formats which take too long to load so I want to write them out in their "prepared" format as a cache that will load much faster on subsequent runs of the program.

Is it safe to do it like this? My worries are around directly reading into the data of the vector? I've removed error checking, hard coded 4 as the size of the int and so on so that i can give a short working example, I know it's bad code, my question really is if it is safe in c++ to read a whole array of structures directly into a vector like this? I believe it to be so, but c++ has so many traps and undefined behavour when you start going low level and dealing directly with raw memory like this.

I realise that number formats and sizes may change across platforms and compilers but this will only even be read and written by the same compiler program to cache data that may be needed on a later run of the same program.

#include <fstream>
#include <vector>

using namespace std;

struct Vertex
{
    float x, y, z;
};

typedef vector<Vertex> VertexList;

int main()
{
    // Create a list for testing
    VertexList list;
    Vertex v1 = {1.0f, 2.0f,   3.0f}; list.push_back(v1);
    Vertex v2 = {2.0f, 100.0f, 3.0f}; list.push_back(v2);
    Vertex v3 = {3.0f, 200.0f, 3.0f}; list.push_back(v3);
    Vertex v4 = {4.0f, 300.0f, 3.0f}; list.push_back(v4);

    // Write out a list to a disk file
    ofstream os ("data.dat", ios::binary);

    int size1 = list.size();
    os.write((const char*)&size1, 4);
    os.write((const char*)&list[0], size1 * sizeof(Vertex));
    os.close();


    // Read it back in
    VertexList list2;

    ifstream is("data.dat", ios::binary);
    int size2;
    is.read((char*)&size2, 4);
    list2.resize(size2);

     // Is it safe to read a whole array of structures directly into the vector?
    is.read((char*)&list2[0], size2 * sizeof(Vertex));

}

As Laurynas says, std::vector is guaranteed to be contiguous, so that should work, but it is potentially non-portable.

On most systems, sizeof(Vertex) will be 12, but it's not uncommon for the struct to be padded, so that sizeof(Vertex) == 16 . If you were to write the data on one system and then read that file in on another, there's no guarantee that it will work correctly.

You might be interested in the Boost.Serialization library. It knows how to save/load STL containers to/from disk, among other things. It might be overkill for your simple example, but it might become more useful if you do other types of serialization in your program.

Here's some sample code that does what you're looking for:

#include <algorithm>
#include <fstream>
#include <vector>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/vector.hpp>

using namespace std;

struct Vertex
{
    float x, y, z;
};

bool operator==(const Vertex& lhs, const Vertex& rhs)
{
    return lhs.x==rhs.x && lhs.y==rhs.y && lhs.z==rhs.z;
}

namespace boost { namespace serialization {
    template<class Archive>
    void serialize(Archive & ar, Vertex& v, const unsigned int version)
    {
        ar & v.x; ar & v.y; ar & v.z;
    }
} }

typedef vector<Vertex> VertexList;

int main()
{
    // Create a list for testing
    const Vertex v[] = {
        {1.0f, 2.0f,   3.0f},
        {2.0f, 100.0f, 3.0f},
        {3.0f, 200.0f, 3.0f},
        {4.0f, 300.0f, 3.0f}
    };
    VertexList list(v, v + (sizeof(v) / sizeof(v[0])));

    // Write out a list to a disk file
    {
        ofstream os("data.dat", ios::binary);
        boost::archive::binary_oarchive oar(os);
        oar << list;
    }

    // Read it back in
    VertexList list2;

    {
        ifstream is("data.dat", ios::binary);
        boost::archive::binary_iarchive iar(is);
        iar >> list2;
    }

    // Check if vertex lists are equal
    assert(list == list2);

    return 0;
}

Note that I had to implement a serialize function for your Vertex in the boost::serialization namespace. This lets the serialization library know how to serialize Vertex members.

I've browsed through the boost::binary_oarchive source code and it seems that it reads/writes the raw vector array data directly from/to the stream buffer. So it should be pretty fast.

std::vector保证在内存中是连续的,所以,是的。

I just ran into this exact same problem.

First off, these statements are broken

os.write((const char*)&list[0], size1 * sizeof(Vertex));
is.read((char*)&list2[0], size2 * sizeof(Vertex));

There is other stuff in the Vector data structure, so this will make your new vector get filled up with garbage.

Solution:
When you are writing your vector into a file, don't worry about the size your Vertex class, just directly write the entire vector into memory.

os.write((const char*)&list, sizeof(list));

And then you can read the entire vector into memory at once

is.seekg(0,ifstream::end);
long size2 = is.tellg();
is.seekg(0,ifstream::beg);
list2.resize(size2);
is.read((char*)&list2, size2);

Another alternative to explicitly reading and writing your vector<> from and to a file is to replace the underlying allocator with one that allocates memory from a memory mapped file. This would allow you to avoid an intermediate read/write related copy. However, this approach does have some overhead. Unless your file is very large it may not make sense for your particular case. Profile as usual to determine if this approach is a good fit.

There are also some caveats to this approach that are handled very well by the Boost.Interprocess library. Of particular interest to you may be its allocators and containers .

If this is used for caching by the same code, I don't see any problem with this. I've used this same technique on multiple systems without a problem (all Unix based). As an extra precaution, you might want to write a struct with known values at the beginning of the file, and check that it reads ok. You might also want to record the size of the struct in the file. This will save a lot of debugging time in the future if the padding ever changes.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM