Serializing values to a string of bytes in a platform-independent way
I'm writing some serialization code that will work at a lower level than I'm used to. I need functions to take various value types (int32_t, int64_t, float, etc.) and shove them into a vector<unsigned char> in preparation for being written to a file. The file will be read and reconstituted in an analogous way.
The functions to write to the vector look like this:
void write_int32(std::vector<unsigned char>& buffer, int32_t value)
{
    buffer.push_back((value >> 24) & 0xff);
    buffer.push_back((value >> 16) & 0xff);
    buffer.push_back((value >> 8) & 0xff);
    buffer.push_back(value & 0xff);
}

void write_float(std::vector<unsigned char>& buffer, float value)
{
    assert(sizeof(float) == sizeof(int32_t));
    write_int32(buffer, *(int32_t *)&value);
}
These bit-shifting, type-punning atrocities seem to work on the single machine I've used so far, but they feel extremely fragile. Where can I learn which operations are guaranteed to yield the same results across architectures, float representations, etc.? Specifically, is there a safer way to do what I've done in these two example functions?
A human-readable representation is the safest. XML with an XSD is one option that lets you specify value and format exactly.
If you really want a binary representation, look at the hton* and ntoh* functions:
http://beej.us/guide/bgnet/output/html/multipage/htonsman.html
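For illustration, here is a minimal sketch of the binary route that writes big-endian (network) byte order explicitly, which is the same byte order htonl produces, without depending on the sockets headers. The names write_uint32, read_uint32, read_int32 and read_float are hypothetical helpers, not from the question. The float functions assume a 32-bit IEEE 754 float whose in-memory byte order matches that of uint32_t, which holds on common platforms but is not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// Append a 32-bit value as four bytes, most significant byte first.
// Shifting an unsigned value is fully defined, unlike shifting a negative int.
void write_uint32(std::vector<unsigned char>& buffer, std::uint32_t value)
{
    buffer.push_back(static_cast<unsigned char>((value >> 24) & 0xffu));
    buffer.push_back(static_cast<unsigned char>((value >> 16) & 0xffu));
    buffer.push_back(static_cast<unsigned char>((value >> 8) & 0xffu));
    buffer.push_back(static_cast<unsigned char>(value & 0xffu));
}

void write_int32(std::vector<unsigned char>& buffer, std::int32_t value)
{
    // Signed-to-unsigned conversion is well defined (modulo 2^32).
    write_uint32(buffer, static_cast<std::uint32_t>(value));
}

void write_float(std::vector<unsigned char>& buffer, float value)
{
    // Assumptions, checked at compile time: 32-bit IEEE 754 float.
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    static_assert(std::numeric_limits<float>::is_iec559, "float must be IEEE 754");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copies the bits without aliasing UB
    write_uint32(buffer, bits);
}

// Reassemble four big-endian bytes starting at offset.
std::uint32_t read_uint32(const std::vector<unsigned char>& buffer, std::size_t offset)
{
    assert(offset + 4 <= buffer.size());
    return (static_cast<std::uint32_t>(buffer[offset]) << 24) |
           (static_cast<std::uint32_t>(buffer[offset + 1]) << 16) |
           (static_cast<std::uint32_t>(buffer[offset + 2]) << 8) |
            static_cast<std::uint32_t>(buffer[offset + 3]);
}

std::int32_t read_int32(const std::vector<unsigned char>& buffer, std::size_t offset)
{
    std::uint32_t bits = read_uint32(buffer, offset);
    std::int32_t value;
    std::memcpy(&value, &bits, sizeof value);  // avoids an out-of-range conversion
    return value;
}

float read_float(const std::vector<unsigned char>& buffer, std::size_t offset)
{
    std::uint32_t bits = read_uint32(buffer, offset);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

With these helpers, write_int32(buf, 0x01020304) appends the bytes 01 02 03 04 regardless of the host's endianness, and read_int32(buf, 0) recovers the original value.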
Usually the best way to do this is to employ an external library designed for this purpose; it's all too easy to introduce platform disagreement bugs, especially when trying to transmit info like floating point types. There are multiple options for open-source software that does this. One example is Google Protocol Buffers, which in addition to being platform-neutral has the benefit of being language-independent (it generates code for use in serialization based on messages you define).
I wanted something quick and lightweight, so I whipped up a simple and stupid text serialization format. Each value is written to the file using something barely more complicated than
output_buffer << value << ' ';
Protocol Buffers would have worked okay, but I was worried they'd take too long to integrate. XML's verbosity would have been a problem for me: I need to serialize thousands of values, and even having <a>...</a> wrapping each number would have added nearly a megabyte to each file. I tried MessagePack, but it just seemed like an awkward fit with C++'s static typing. What I came up with isn't clever, but it works great.
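A space-separated text format like this might be sketched as follows. The names write_value and read_value are hypothetical, not the ones actually used. The one non-obvious detail is that a float needs max_digits10 significant digits to survive the round trip exactly. The sketch also assumes the streams use the default "C" locale, so the decimal separator is the same when writing and reading.

```cpp
#include <cstdint>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: write one value followed by a single space.
// Note: char-sized types such as int8_t would be written as characters,
// so widen them to int before calling this.
template <typename T>
void write_value(std::ostream& out, const T& value)
{
    out << value << ' ';
}

// Floats get enough significant digits to be read back bit-for-bit.
// The precision setting persists on the stream, which is harmless here.
void write_value(std::ostream& out, float value)
{
    out.precision(std::numeric_limits<float>::max_digits10);
    out << value << ' ';
}

// Read the next whitespace-separated value; returns false on failure.
template <typename T>
bool read_value(std::istream& in, T& value)
{
    return static_cast<bool>(in >> value);
}
```

Reading works by extracting values in the same order and with the same types as they were written, so the format carries no type information of its own.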