
Binary parser or serialization?

Original post: 2010-08-25 23:03:34 | Tags: c++, parsing, serialization, binary, protocols

I want to store a graph of different objects for a game. Their classes may or may not be related, and they may or may not contain vectors of simple structures.

I want the parsing operation to be fast; the data can be pretty big.
Adding new things should not be hard, and it should not break backward compatibility.
Smaller file size is kind of important.
Readability counts.

By serialization I mean making objects serialize themselves, which is effective, but I would need to write a different serialization method for each kind of object.
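To make the "objects serialize themselves" option concrete, here is a minimal sketch. The names `Bytes`, `put`, `get`, `Vec3`, and `Mesh` are hypothetical and not from the question, and the layout assumes host byte order and that writer and reader are built the same way. It is an illustration under those assumptions, not a recommended file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical byte buffer type for this sketch.
using Bytes = std::vector<std::uint8_t>;

// Append the raw bytes of a trivially copyable value (host byte order).
template <typename T>
void put(Bytes& out, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    const auto* p = reinterpret_cast<const std::uint8_t*>(&value);
    out.insert(out.end(), p, p + sizeof(T));
}

// Read a trivially copyable value back and advance the read position.
template <typename T>
T get(const Bytes& in, std::size_t& pos) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    assert(pos + sizeof(T) <= in.size());
    T value;
    std::memcpy(&value, in.data() + pos, sizeof(T));
    pos += sizeof(T);
    return value;
}

// A "simple structure" of the kind the objects might hold in vectors.
struct Vec3 { float x, y, z; };

// Hypothetical game object that knows how to write and read itself.
struct Mesh {
    std::uint32_t id = 0;
    std::vector<Vec3> vertices;

    void serialize(Bytes& out) const {
        put(out, id);
        put(out, static_cast<std::uint32_t>(vertices.size()));  // element count first
        for (const Vec3& v : vertices) put(out, v);
    }

    static Mesh deserialize(const Bytes& in, std::size_t& pos) {
        Mesh m;
        m.id = get<std::uint32_t>(in, pos);
        const std::uint32_t count = get<std::uint32_t>(in, pos);
        m.vertices.reserve(count);
        for (std::uint32_t i = 0; i < count; ++i) m.vertices.push_back(get<Vec3>(in, pos));
        return m;
    }
};
```

The cost mentioned above shows up directly: every new class needs its own `serialize`/`deserialize` pair, and both must be kept in step by hand.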

By binary parsing/composing I mean creating a new tree of parsers/composers that holds and reads the data for these objects, and passing it around so that my objects can push/pull their data.
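One way to read the parser/composer idea is as a generic tree that objects fill in or read from, with a single pair of functions that turns the tree into bytes and back. The sketch below assumes that reading. `Node`, `compose`, `parse`, and `Transform` are hypothetical names, and the byte layout (host byte order, 32-bit counts) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical tree node: a name, a flat list of floats, and child nodes.
struct Node {
    std::string name;
    std::vector<float> values;
    std::vector<Node> children;  // vector of an incomplete type is guaranteed since C++17
};

void put_u32(Bytes& out, std::uint32_t v) {
    std::uint8_t raw[sizeof v];
    std::memcpy(raw, &v, sizeof v);
    out.insert(out.end(), raw, raw + sizeof v);
}

std::uint32_t get_u32(const Bytes& in, std::size_t& pos) {
    assert(pos + sizeof(std::uint32_t) <= in.size());
    std::uint32_t v;
    std::memcpy(&v, in.data() + pos, sizeof v);
    pos += sizeof v;
    return v;
}

// Composer: emits name, values, then children, recursively.
void compose(const Node& n, Bytes& out) {
    put_u32(out, static_cast<std::uint32_t>(n.name.size()));
    out.insert(out.end(), n.name.begin(), n.name.end());
    put_u32(out, static_cast<std::uint32_t>(n.values.size()));
    const auto* p = reinterpret_cast<const std::uint8_t*>(n.values.data());
    out.insert(out.end(), p, p + n.values.size() * sizeof(float));
    put_u32(out, static_cast<std::uint32_t>(n.children.size()));
    for (const Node& c : n.children) compose(c, out);
}

// Parser: the mirror image of compose().
Node parse(const Bytes& in, std::size_t& pos) {
    Node n;
    const std::uint32_t name_len = get_u32(in, pos);
    assert(pos + name_len <= in.size());
    n.name.assign(reinterpret_cast<const char*>(in.data() + pos), name_len);
    pos += name_len;
    const std::uint32_t value_count = get_u32(in, pos);
    assert(pos + value_count * sizeof(float) <= in.size());
    n.values.resize(value_count);
    if (value_count > 0) std::memcpy(n.values.data(), in.data() + pos, value_count * sizeof(float));
    pos += value_count * sizeof(float);
    const std::uint32_t child_count = get_u32(in, pos);
    for (std::uint32_t i = 0; i < child_count; ++i) n.children.push_back(parse(in, pos));
    return n;
}

// Hypothetical game object that pushes its data into a node and pulls it back.
struct Transform {
    float position[3] = {0.0f, 0.0f, 0.0f};
    float scale = 1.0f;

    void push(Node& n) const {
        n.name = "transform";
        n.values = {position[0], position[1], position[2], scale};
    }

    void pull(const Node& n) {
        // Only read what is present, so shorter (older) data keeps the defaults.
        for (std::size_t i = 0; i < 3 && i < n.values.size(); ++i) position[i] = n.values[i];
        if (n.values.size() > 3) scale = n.values[3];
    }
};
```

In this arrangement the objects never touch bytes; only `compose` and `parse` know the file layout, so changing the layout does not require touching every class.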

I could also use JSON, but it can be pretty slow to read, and it is not very size-efficient when it comes to pretty big sets of matrices and numbers.
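The size concern can be checked with a few lines of code. The sketch below compares the raw size of a list of 32-bit floats with the same values written as a JSON-style array. The function names are hypothetical, and `"%.9g"` is used on the assumption that nine significant digits are wanted so that each float round-trips exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Size of the values when stored as raw 32-bit floats.
std::size_t binary_size(const std::vector<float>& v) {
    return v.size() * sizeof(float);
}

// The same values as a JSON-style array such as "[0.25,1]".
// "%.9g" prints up to nine significant digits and drops trailing zeros.
std::string as_text(const std::vector<float>& v) {
    std::string out = "[";
    char buf[32];
    for (std::size_t i = 0; i < v.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%.9g", v[i]);
        if (i != 0) out += ',';
        out += buf;
    }
    out += ']';
    return out;
}
```

Values that need all their digits, as matrix entries often do, cost roughly ten or more characters each in text against four bytes in binary. Short values such as `1` or `0.25` are cheaper in text, so the ratio depends on the data.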

3 answers

Point by point:

Fast parsing: binary (since you don't necessarily have to "parse", you can just deserialize).
Adding new things: text.
Smaller: text (even if gzipped text is larger than binary, it won't be much larger).
Readability: text.

So that's three votes for text and one for binary. Personally, I'd go with text for everything except images (and other data which is "naturally" binary). Then store everything in a big zip file (I can think of several games that do this or something close to it).

Good reads: The Importance of Being Textual and The Power of Plain Text.

Check out Protocol Buffers from Google or Thrift from Apache. Although billed as a way to write wire protocols easily, each is basically an object serialization mechanism that can create bindings in a dozen languages, has an efficient binary representation, easy versioning, and fast performance, and is well supported.
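The "easy versioning" point rests on a general idea: every field is written with a numeric tag, and a reader skips tags it does not recognize. The sketch below illustrates that idea with a hand-rolled tag/length/payload layout. It is not the actual Protocol Buffers or Thrift wire format or API; the tags, the one-byte length, and the names `write_field`, `write_v2`, `read_v1`, and `PlayerV1` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical field tags; kTagShield stands for a field added in a later version.
enum : std::uint8_t { kTagHealth = 1, kTagSpeed = 2, kTagShield = 3 };

// Write one field as: 1-byte tag, 1-byte payload length, payload bytes.
void write_field(Bytes& out, std::uint8_t tag, const void* data, std::uint8_t len) {
    out.push_back(tag);
    out.push_back(len);
    const auto* p = static_cast<const std::uint8_t*>(data);
    out.insert(out.end(), p, p + len);
}

// What an older build of the game knows about a player.
struct PlayerV1 {
    std::int32_t health = 100;
    float speed = 1.0f;
};

// A newer writer that emits an extra field the old reader has never seen.
Bytes write_v2(std::int32_t health, float speed, std::int32_t shield) {
    Bytes out;
    write_field(out, kTagHealth, &health, static_cast<std::uint8_t>(sizeof health));
    write_field(out, kTagShield, &shield, static_cast<std::uint8_t>(sizeof shield));
    write_field(out, kTagSpeed, &speed, static_cast<std::uint8_t>(sizeof speed));
    return out;
}

// The older reader: handles the tags it knows and skips the rest by length.
PlayerV1 read_v1(const Bytes& in) {
    PlayerV1 p;
    std::size_t pos = 0;
    while (pos + 2 <= in.size()) {
        const std::uint8_t tag = in[pos];
        const std::uint8_t len = in[pos + 1];
        pos += 2;
        if (pos + len > in.size()) break;  // truncated input: stop reading
        if (tag == kTagHealth && len == sizeof p.health) {
            std::memcpy(&p.health, &in[pos], len);
        } else if (tag == kTagSpeed && len == sizeof p.speed) {
            std::memcpy(&p.speed, &in[pos], len);
        }
        // Any other tag is ignored; the length still tells us how far to skip.
        pos += len;
    }
    return p;
}
```

Because unknown fields are skipped and missing fields keep their defaults, old readers accept new files and new readers accept old files, which is the backward-compatibility property the question asks for. The real libraries generate this kind of code from a schema rather than requiring it to be written by hand.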

We're using Boost.Serialization. I don't know how it performs next to the options suggested by samkass.