Reading and writing a vector of structs to file

I've read a few posts on Stack Overflow and a number of other sites about writing vectors to files. I've implemented something that I feel should work, but I'm having some trouble. One of the data members in the struct is of class string, and when reading the vector back in, that data is lost. Also, after writing the first iteration, additional iterations cause a malloc error. How can I modify the code below so that I can save the vector to a file, then read it back in when the program launches again? Currently, the read is done in the constructor and the write in the destructor of a class whose only data member is the vector, but which has methods to manipulate that vector.

Here is the gist of my read / write methods, assuming vector<element> elements ...

Read:

ifstream infile;
infile.open("data.dat", ios::in | ios::binary);
infile.seekg (0, ios::end);
elements.resize(infile.tellg()/sizeof(element));
infile.seekg (0, ios::beg);
infile.read( (char *) &elements[0], elements.capacity()*sizeof(element));
infile.close();

Write:

ofstream outfile;
outfile.open("data.dat", ios::out | ios::binary | ios_base::trunc);
elements.resize(elements.size());
outfile.write( (char *) &elements[0], elements.size() * sizeof(element));
outfile.close();

Struct element:

struct element {
int id;
string test;
int other;        
};

In C++, the memory of an object generally cannot be written to disk and read back byte for byte like that. In particular, your struct element contains a string, which is a non-POD data type, and therefore cannot be copied as raw bytes.

A thought experiment might help clarify this. Your code assumes that all your element values are the same size. What would happen if one of the string test values was longer than what you've assumed? How would your code know what size to use when reading and writing to disk?
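The thought experiment can be checked directly. The sketch below is an illustration rather than part of the original answer; the helper names same_object_size and element_is_raw_copyable are made up for the example. It shows that the size of a string object does not depend on how much text it holds, and that the standard library itself reports element as unsafe to copy as raw bytes.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Same layout as the struct in the question.
struct element {
    int id;
    std::string test;
    int other;
};

// Hypothetical helper: sizeof is a compile-time constant, so the object
// size is the same no matter how long the stored text is.
bool same_object_size(const std::string& a, const std::string& b)
{
    return sizeof a == sizeof b;
}

// Hypothetical helper: true only for types that may be copied as raw
// bytes (for example with read()/write() on the object's memory).
bool element_is_raw_copyable()
{
    return std::is_trivially_copyable<element>::value;
}
```

Comparing a one-character string with a thousand-character string through same_object_size gives true, while element_is_raw_copyable gives false, which is why dumping &elements[0] to disk does not capture the text.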

You will want to read about serialization for more information about how to handle this.

Your code assumes all the relevant data exists directly inside the vector, whereas strings are fixed-size objects that have pointers which can address their variable-sized content on the heap. You're basically saving the pointers and not the text. You should write some string serialisation code, for example:

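bool write_string(std::ostream& os, const std::string& s)
{
    // ASCII length and a '|' terminator (the format read_string expects),
    // followed by the raw bytes of the text.
    return (os << s.size() << '|') && os.write(s.data(), s.size());
}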

Then you can write serialisation routines for your struct. There are a few design options. Many people like to declare Binary_IStream / Binary_OStream types that can house a std::istream or std::ostream; because each is a distinct type, it can be used to create a separate set of serialisation routines, along the lines of:

Binary_OStream& operator<<(Binary_OStream& os, const Some_Class&);

Or, you can just abandon the usual streaming notation when dealing with binary serialisation, and use function call notation instead. Obviously, it's nice to let the same code correctly output both binary serialisation and human-readable serialisation, so the operator-based approach is appealing.
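As a rough sketch of the wrapper idea, assuming a Binary_OStream that simply holds a reference to a std::ostream (the member name os and the two overloads shown are illustrative choices, not an established API):

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper: a distinct type, so operator<< can be overloaded
// for binary output without clashing with the usual text overloads.
struct Binary_OStream {
    std::ostream& os;
    explicit Binary_OStream(std::ostream& s) : os(s) {}
};

// Writes the four raw bytes of v in host byte order (not portable
// across machines with different endianness).
Binary_OStream& operator<<(Binary_OStream& b, std::int32_t v)
{
    b.os.write(reinterpret_cast<const char*>(&v), sizeof v);
    return b;
}

// Writes a 32-bit length followed by the raw bytes of the text.
// Assumes the string is shorter than 2 GiB.
Binary_OStream& operator<<(Binary_OStream& b, const std::string& s)
{
    b << static_cast<std::int32_t>(s.size());
    b.os.write(s.data(), static_cast<std::streamsize>(s.size()));
    return b;
}
```

With this in place, wrapping a std::ostringstream in a Binary_OStream and streaming a std::int32_t followed by a three-character std::string produces 4 + 4 + 3 bytes, while the same values sent to the plain std::ostream would still produce readable text.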

If you serialise numbers, you need to decide whether to do so in a binary format or ASCII. With a pure binary format, where portability is required (even between 32-bit and 64-bit compiles on the same OS), you may need to make some effort to encode and use type size metadata (e.g. int32_t or int64_t?) as well as endianness (e.g. consider network byte order and the ntohl() family of functions). With ASCII you can avoid some of those considerations, but it's variable length and can be slower to write/read. Below, I arbitrarily use ASCII with a '|' terminator for numbers.
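For comparison with the ASCII approach, here is a minimal sketch of the binary alternative, assuming a fixed 32-bit width and big-endian (network) byte order. The function names write_u32_be and read_u32_be are invented for this example; shifting the bytes by hand avoids depending on the platform-specific header that provides ntohl().

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes v as exactly four bytes, most significant byte first.
bool write_u32_be(std::ostream& os, std::uint32_t v)
{
    const char bytes[4] = {
        static_cast<char>((v >> 24) & 0xFF),
        static_cast<char>((v >> 16) & 0xFF),
        static_cast<char>((v >> 8) & 0xFF),
        static_cast<char>(v & 0xFF)
    };
    return static_cast<bool>(os.write(bytes, 4));
}

// Reads four bytes and reassembles them, regardless of host endianness.
bool read_u32_be(std::istream& is, std::uint32_t& v)
{
    unsigned char bytes[4];
    if (!is.read(reinterpret_cast<char*>(bytes), 4))
        return false;
    v = (static_cast<std::uint32_t>(bytes[0]) << 24) |
        (static_cast<std::uint32_t>(bytes[1]) << 16) |
        (static_cast<std::uint32_t>(bytes[2]) << 8) |
         static_cast<std::uint32_t>(bytes[3]);
    return true;
}
```

Every value occupies four bytes on disk whatever its magnitude, which makes records fixed-size but not human-readable.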

bool write_element(std::ostream& os, const element& e)
{
    return (os << e.id << '|') && write_string(os, e.test) && (os << e.other << '|');
}

And then for your vector:

os << elements.size() << '|';
for (std::vector<element>::const_iterator i = elements.begin();
     i != elements.end(); ++i)
    write_element(os, *i);

To read this back:

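std::vector<element> elements;
size_t n;
char c;
// The element count was written with a trailing '|', so consume it too.
if ((is >> n >> c) && c == '|')
    for (size_t i = 0; i < n; ++i)
    {
        element e;
        if (!read_element(is, e))
            return false; // fail
        elements.push_back(e);
    }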

...which needs...

bool read_element(std::istream& is, element& e)
{
    char c;
    return (is >> e.id >> c) && c == '|' &&
           read_string(is, e.test) &&
           (is >> e.other >> c) && c == '|';
}

...and...

bool read_string(std::istream& is, std::string& s)
{
    size_t n;
    char c;
    if ((is >> n >> c) && c == '|')
    {
        s.resize(n);
        // &s[0] gives a writable buffer; s.data() is const before C++17.
        return n == 0 || static_cast<bool>(is.read(&s[0], n));
    }
    return false;
}
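Putting the pieces together, a self-contained round trip might look like the sketch below. It follows the '|'-terminated ASCII format described above; write_elements and read_elements are names invented here to wrap the two vector loops, and the string length is written in ASCII so that it matches what read_string expects.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct element {
    int id;
    std::string test;
    int other;
};

// ASCII length, '|', then the raw bytes of the text.
bool write_string(std::ostream& os, const std::string& s)
{
    return (os << s.size() << '|') &&
           os.write(s.data(), static_cast<std::streamsize>(s.size()));
}

bool read_string(std::istream& is, std::string& s)
{
    std::size_t n;
    char c;
    if (!((is >> n >> c) && c == '|'))
        return false;
    s.resize(n);
    return n == 0 ||
           static_cast<bool>(is.read(&s[0], static_cast<std::streamsize>(n)));
}

bool write_element(std::ostream& os, const element& e)
{
    return (os << e.id << '|') && write_string(os, e.test) &&
           (os << e.other << '|');
}

bool read_element(std::istream& is, element& e)
{
    char c;
    return (is >> e.id >> c) && c == '|' &&
           read_string(is, e.test) &&
           (is >> e.other >> c) && c == '|';
}

// Hypothetical wrapper: element count, '|', then each element in turn.
bool write_elements(std::ostream& os, const std::vector<element>& v)
{
    if (!(os << v.size() << '|'))
        return false;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (!write_element(os, v[i]))
            return false;
    return true;
}

// Hypothetical wrapper: replaces the contents of v with what was read.
bool read_elements(std::istream& is, std::vector<element>& v)
{
    std::size_t n;
    char c;
    if (!((is >> n >> c) && c == '|'))
        return false;
    v.clear();
    for (std::size_t i = 0; i < n; ++i)
    {
        element e;
        if (!read_element(is, e))
            return false;
        v.push_back(e);
    }
    return true;
}
```

The functions accept any std::ostream / std::istream, so the same calls should work with std::ofstream and std::ifstream. Opening those files with std::ios::binary is advisable, because the string bytes are written raw and newline translation on some platforms could otherwise change their length.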
