
What are the differences between the XmlSerializer and BinaryFormatter?

I spent a good portion of time last week working on serialization. During that time I found many examples utilizing either the BinaryFormatter or XmlSerializer. Unfortunately, what I did not find were any examples comprehensively detailing the differences between the two.

The genesis of my curiosity lies in why the BinaryFormatter is able to deserialize directly to an interface whilst the XmlSerializer is not. Jon Skeet, in an answer to "casting to multiple (unknown types) at runtime", provides an example of direct binary serialization to an interface. Stan R. provided me with the means of accomplishing my goal using the XmlSerializer in his answer to "XML Object Deserialization to Interface."

Beyond the obvious point that the BinaryFormatter uses binary serialization whilst the XmlSerializer uses XML, I'd like to understand the fundamental differences more fully: when to use one or the other, and the pros and cons of each.

The reason a binary formatter is able to deserialize directly to an interface type is that when an object is originally serialized to a binary stream, metadata containing type and assembly information is stored alongside the object data. This means that when the binary formatter deserializes the object it knows its type and builds the correct object, and you can then cast that to an interface type that the object implements.
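The idea can be sketched in a language-neutral way. The Python below is only an analogy and does not reproduce BinaryFormatter's wire format: `KNOWN_TYPES`, `dump_with_type`, `load_with_type` and `dump_data_only` are hypothetical names, and JSON is used purely as a convenient carrier. It shows that a payload which records the concrete type name can be rebuilt as that type without the reader naming it, while a data-only payload cannot.

```python
import json


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# Hypothetical registry standing in for type and assembly lookup.
KNOWN_TYPES = {cls.__name__: cls for cls in (Circle, Square)}


def dump_with_type(obj):
    """Store the concrete type name next to the object's data."""
    return json.dumps({"type": type(obj).__name__, "data": vars(obj)})


def load_with_type(payload):
    """Rebuild whichever concrete type the payload names."""
    envelope = json.loads(payload)
    return KNOWN_TYPES[envelope["type"]](**envelope["data"])


def dump_data_only(obj):
    """Store only the data; the concrete type is not recorded."""
    return json.dumps(vars(obj))


# The reader never mentions Square, yet gets a Square back and can
# use it through the shared "has an area()" interface.
shape = load_with_type(dump_with_type(Square(side=2.0)))

# Without type information the reader only gets raw data back and
# would have to be told which type to construct.
plain = json.loads(dump_data_only(Square(side=2.0)))
```

Here `shape` is a `Square` instance, whereas `plain` is just a dictionary of values.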

The XML serializer, on the other hand, just serializes to a schema. It serializes only the public fields and properties of the object and no type information beyond that (e.g. the interfaces the type implements).

Here is a good post, .NET Serialization, comparing the BinaryFormatter, SoapFormatter, and XmlSerializer. I recommend you look at the following table which, in addition to the previously mentioned serializers, includes the DataContractSerializer, NetDataContractSerializer and protobuf-net.

[Image: serialization comparison table]

Just to weigh in...

The obvious difference between the two is "binary vs XML", but it does go a lot deeper than that:

  • fields (BinaryFormatter = bf) vs public members (typically properties) (XmlSerializer = xs)
  • type-metadata based (bf) vs contract-based (xs)
  • version-brittle (bf) vs version-tolerant (xs)
  • "graph" (bf) vs "tree" (xs)
  • .NET specific (bf) vs portable (xs)
  • opaque (bf) vs human-readable (xs)
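The "graph" vs "tree" point is easy to miss, so here is a small sketch of what a tree-shaped format does to shared and cyclic references. It uses Python's `json` module as a stand-in for a tree-shaped serializer; it is an analogy for the concept and does not reproduce XmlSerializer's exact behaviour.

```python
import json

shared = {"name": "shared child"}
parent = {"left": shared, "right": shared}  # one object, referenced twice

# A tree-shaped format writes the shared child out twice...
text = json.dumps(parent)
restored = json.loads(text)

# ...so after the round trip the two references are separate objects.
same_before = parent["left"] is parent["right"]
same_after = restored["left"] is restored["right"]

# A cycle cannot be written as a tree at all.
node = {"name": "node"}
node["self"] = node
try:
    json.dumps(node)
    cycle_error = None
except ValueError as exc:
    cycle_error = str(exc)
```

A graph-preserving serializer would keep `left` and `right` pointing at the same object after the round trip and would be able to write out the cycle.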

For a discussion of why BinaryFormatter can be brittle, see here.

It is impossible to say in general which is bigger; all the type metadata in BinaryFormatter can make it bigger. And XmlSerializer output can work very well with compression like gzip.
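As a rough illustration of why XML tends to compress well, the sketch below builds a repetitive XML document and gzips it. The element names (`Orders`, `Order`, `Id`, `Customer`) are made up for the example, and the document is built with Python's standard library rather than produced by XmlSerializer, so treat the ratio as indicative only.

```python
import gzip
import xml.etree.ElementTree as ET

# Build a repetitive document, similar in shape to a serialized list.
root = ET.Element("Orders")
for i in range(1000):
    order = ET.SubElement(root, "Order")
    ET.SubElement(order, "Id").text = str(i)
    ET.SubElement(order, "Customer").text = f"customer-{i % 10}"

raw = ET.tostring(root, encoding="utf-8")  # bytes
compressed = gzip.compress(raw)
restored = gzip.decompress(compressed)

# Fraction of the original size that remains after compression.
ratio = len(compressed) / len(raw)
print(f"raw={len(raw)} bytes, gzip={len(compressed)} bytes, ratio={ratio:.2f}")
```

The repeated tag names are what the compressor removes, which is why the size comparison between formats depends heavily on whether the transport compresses the payload.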

However, it is possible to take the strengths of each; for example, Google have open-sourced their own data serialization format, "protocol buffers". This is:

  • contract-based
  • portable (see list of implementations)
  • version-tolerant
  • tree-based
  • opaque (although there are tools to show data when combined with a .proto)
  • typically "contract first", but some implementations allow implicit contracts based on reflection

But importantly, it is very dense data (no type metadata, pure binary representation, short tags, tricks like variable-length "varint" encoding that carries 7 bits of payload per byte), and very efficient to process (no complex XML structure, no strings to match to members, etc.).
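The variable-length integer trick can be sketched in a few lines. Each byte carries 7 bits of the value, least significant group first, and the high bit of a byte says whether another byte follows. The function names `encode_varint` and `decode_varint` are chosen for this example, and the sketch covers non-negative integers only.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protocol buffers style varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # next 7 bits of payload
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


print(encode_varint(1).hex())    # 01
print(encode_varint(300).hex())  # ac02
```

Small numbers, which dominate most real data, take a single byte instead of a fixed four or eight.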

I might be a little biased; I maintain one of the implementations (there are several suitable for C#/.NET), but you'll note I haven't linked to any specific implementation; the format stands on its own merits ;-p

The XML Serializer produces XML and also an XML Schema (implicitly). It will produce XML that conforms to this schema.

One implication is that it will not serialize anything which cannot be described in XML Schema. For instance, there is no way to distinguish between a list and an array in XML Schema, so the XML Schema produced by the serializer can be interpreted either way.

Runtime serialization (which the BinaryFormatter is part of) serializes the actual .NET types to the other side, so if you send a List<int>, the other side will get a List<int>.

That obviously works better if the other side is running .NET.

The XmlSerializer serializes a type by reading all of the type's properties that have both a public getter and a public setter (and also any public fields). In this sense the XmlSerializer serializes/deserializes the "public view" of the instance.

The binary formatter, by contrast, serializes a type by serializing the instance's "internals", i.e. its fields. Any fields that are not marked as [NonSerialized] will be serialized to the binary stream. The type itself must be marked as [Serializable], as must the types of any fields that are also to be serialized.

I guess one of the most important differences is that binary serialization can serialize both public and private members, whereas XML serialization works only with public ones.

The post linked below provides a very helpful comparison of the two in terms of size. Size is a very important issue, because you might send your serialized object to a remote machine.

http://www.nablasoft.com/alkampfer/index.php/2008/10/31/binary-versus-xml-serialization-size/
