Understanding the concept of "serializing by reference"

Question

I'm writing my own binary serializer optimized for game development. So far it's fully functional. It emits IL to generate the [de]serialization methods given a sequence of types in advance. The only missing feature is serializing things by reference; everything is currently serialized by value.

In order to implement it, I have to understand it first, and this is what I'm finding a bit tricky. Let me show you what I've understood with a couple of examples:

Example 1 (as seen here):

public class Person
{
    public string Name;
    public Person Friend;
}

static void Main(string[] args)
{
    Person p1 = new Person();
    p1.Name = "John";

    Person p2 = new Person();
    p2.Name = "Mike";

    p1.Friend = p2;

    Person[] group = new Person[] { p1, p2 };

    var serializer = new DataContractSerializer(group.GetType(), null, 
        0x7FFF /*maxItemsInObjectGraph*/, 
        false /*ignoreExtensionDataObject*/, 
        true /*preserveObjectReferences : this is where the magic happens */, 
        null /*dataContractSurrogate*/);

    serializer.WriteObject(Console.OpenStandardOutput(), group);
}

Now this is completely understood. We have a root object, the array, referencing two unique persons. p1.Friend happens to be p2, so instead of serializing p1.Friend by value we just store an id that points to p2, which we've already serialized.

However, have a look at this second example:

    static void Example2()
    {
        var p1 = new Person() { Name = "Diablo" };
        var p2 = new Person() { Name = "Mephesto" };

        p1.Friend = p2;

        var serializer = new DataContractSerializer(typeof(Person), null, 0x7FFF, false, true, null);

        serializer.WriteObject(Console.OpenStandardOutput(), p1);
        Console.WriteLine("\n");
        serializer.WriteObject(Console.OpenStandardOutput(), p2);
    }

Now, according to my understanding: when serializing p1, the serializer will serialize p1.Name and p1.Friend. In the second WriteObject, the serializer has already serialized p2 (which is p1.Friend), so it just serializes an id that points to p1.Friend instead of serializing it by value.

Running the code and viewing the output, that doesn't seem to be the case. In the second output we see the serializer serializing p2 by value, as if it hasn't come across it yet, and that's the part I didn't get. It's as if there's an internal id counter that gets reset at the end of WriteObject.

Here's another similar example:

    static void Example3()
    {
        var p1 = new Person() { Name = "Diablo" };
        var p2 = p1;

        var serializer = new DataContractSerializer(typeof(Person), null, 0x7FFF, false, true, null);

        serializer.WriteObject(Console.OpenStandardOutput(), p1);
        Console.WriteLine("\n");
        serializer.WriteObject(Console.OpenStandardOutput(), p2);
    }

Again, the second output shows that we're serializing p2 as if we haven't encountered a definition for it yet.

Note that I didn't choose DataContractSerializer for any particular reason; any serializer that supports serializing by reference would do.

I tried using ILSpy on DataContractSerializer, but I got lost quickly and couldn't figure out much.

1. In Example2, why didn't the serializer store an id to p1.Friend when serializing p2? Is 'serializing by reference' only applied to a single object hierarchy, or how does it work in general?
2. It seems to me that serializing by reference will automatically handle circular references (A <-> B), is that correct? Or do I need to do other things to make sure I won't fall into an infinite loop?
3. I assume serializing by reference makes sense only when applied to reference types and not value types, correct?

I've tagged protobuf-net because it's similar in that it's a binary serializer and emits IL. I would love to hear how serializing by reference is implemented there :p

Answer 1

1. Each call to write-object is a separate serialization context; the reference tracking is not preserved between calls.
2. As long as you correctly identify previously seen values, it shouldn't recurse endlessly, but a depth check can help avoid issues.
3. Correct, although you could attempt to recognise semantically identical value types if you wanted (perhaps via the structural equality interface).
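
To make the first two points concrete, here is a minimal sketch of per-context reference tracking. It is not how DataContractSerializer or protobuf-net is implemented; `WriteContext`, `IdentityComparer`, `Demo`, and the text output format are all hypothetical names invented for illustration, and the output is plain text rather than a real binary format.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;

public class Person
{
    public string Name;
    public Person Friend;
}

// Compares keys by object identity, ignoring any Equals/GetHashCode overrides.
public sealed class IdentityComparer : IEqualityComparer<object>
{
    bool IEqualityComparer<object>.Equals(object x, object y) => ReferenceEquals(x, y);
    int IEqualityComparer<object>.GetHashCode(object obj) => RuntimeHelpers.GetHashCode(obj);
}

// Hypothetical type: one instance is one serialization context.
public sealed class WriteContext
{
    private readonly Dictionary<object, int> ids =
        new Dictionary<object, int>(new IdentityComparer());

    public void WritePerson(Person p, StringBuilder output)
    {
        if (p == null) { output.Append("null"); return; }

        if (ids.TryGetValue(p, out int seen))
        {
            // Already written in this context: emit a back-reference only.
            output.Append("ref#" + seen);
            return;
        }

        // Register the id BEFORE descending into fields, so a cycle finds it.
        int id = ids.Count + 1;
        ids.Add(p, id);
        output.Append("{id#" + id + " Name=" + p.Name + " Friend=");
        WritePerson(p.Friend, output);
        output.Append("}");
    }
}

public static class Demo
{
    // All roots share one context, so ids carry over from root to root.
    public static string WriteTogether(params Person[] roots)
    {
        var ctx = new WriteContext();
        var sb = new StringBuilder();
        for (int i = 0; i < roots.Length; i++)
        {
            if (i > 0) sb.Append(' ');
            ctx.WritePerson(roots[i], sb);
        }
        return sb.ToString();
    }

    // A fresh context per root, analogous to separate WriteObject calls.
    public static string WriteSeparately(Person root)
    {
        var sb = new StringBuilder();
        new WriteContext().WritePerson(root, sb);
        return sb.ToString();
    }

    public static void Main()
    {
        var p1 = new Person { Name = "Diablo" };
        var p2 = new Person { Name = "Mephesto" };
        p1.Friend = p2;
        p2.Friend = p1; // circular reference

        Console.WriteLine(WriteTogether(p1, p2));
        Console.WriteLine(WriteSeparately(p1));
        Console.WriteLine(WriteSeparately(p2));
    }
}
```

In this sketch, p2 is written as a back-reference only when both roots go through the same context; with a fresh context per root it is written in full again, which matches what Example2 and Example3 show. Registering the id before recursing into the fields is what keeps the A <-> B cycle from looping.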

Additional thought: if you apply this to strings, you might want to special-case them to use value equality rather than reference equality; there's no point serialising two different instances (references) of the same string.
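
The string special case could look roughly like the following sketch. `StringTable` and `GetId` are hypothetical names; the only assumption is that strings get their own lookup keyed by content (ordinal comparison) instead of the identity-based lookup used for other reference types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-context string table keyed by content rather than identity.
public sealed class StringTable
{
    private readonly Dictionary<string, int> ids =
        new Dictionary<string, int>(StringComparer.Ordinal);

    // Returns the id for this content; isNew is true the first time the content is seen,
    // which is when the serializer would write the characters instead of just the id.
    public int GetId(string value, out bool isNew)
    {
        if (ids.TryGetValue(value, out int id))
        {
            isNew = false;
            return id;
        }
        id = ids.Count + 1;
        ids.Add(value, id);
        isNew = true;
        return id;
    }
}

public static class StringDemo
{
    public static void Main()
    {
        string a = "Diablo";
        string b = new string("Diablo".ToCharArray()); // distinct instance, same content

        var table = new StringTable();
        int idA = table.GetId(a, out bool firstA);
        int idB = table.GetId(b, out bool firstB);

        Console.WriteLine(ReferenceEquals(a, b)); // are they the same instance?
        Console.WriteLine(idA == idB);            // do they share one id?
        Console.WriteLine(firstA + " " + firstB); // which call had to write the content?
    }
}
```

With an identity-based lookup the two instances would each be written in full; keying on content lets the second one collapse to an id.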

Answer 1: 2, accepted, 2015-05-01 14:15:21
