Binary serialization in C# (really, WYSIWYG serialization)

(By WYSIWYG I mean that I decide WHAT is written and HOW it's written, not someone at Microsoft or at Google.) (OK... Technically I don't decide anything... Someone who programmed this some years ago decided, and I can only ask how high I have to jump.)

I feel a little stupid today, but I've already lost two hours looking for the solution :-(.

Now...

I have a binary protocol. It's C-based, so it's like looking at a C struct where the endianness of the machine is defined (and luckily it's the same as the "local" endianness), the sizes of the various types are defined, the data structure alignment is defined, the layout of the struct is defined, the strings are fixed arrays of chars in a known encoding... Everything is defined! And everything is very similar to a C# unsafe struct when you are using [StructLayout(LayoutKind.Explicit)] and you are not very picky about using the fixed modifier for arrays.

Now I need to serialize/deserialize it in C#... I've looked around but I wasn't able to find anything... What have I missed?

Before you ask, I know of BinaryFormatter, but it isn't WYSIWYG enough for me: BinaryFormatter implements its own format. Yes, I know of BitConverter (and of the fact that it doesn't implement the converters for big-endian), but it isn't a "complete" solution, only the "base" instrument. And I know of BinaryWriter / BinaryReader, but they don't seem to support arrays that aren't byte[] or char[], and they don't seem to be able to "pad" an array on write (you have a 5-element byte[] array and you need to write it as a 10-element byte[] array because the format you are using requires it... You have to write lines of code to do this).

Plan B (but perhaps even Plan Z) is to create a shadow unsafe struct for each class, plus an IWysiwygSerializable interface with two methods (Read and Write), and to implement the interface in every class. Write would populate the unsafe struct and write it to the output stream; Read would do the opposite. (Or I could skip the struct and make some tens of BitConverter calls directly in Read and Write, but for arrays that's a little more difficult.)
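A minimal sketch of the BitConverter-free variant of Plan B, using BinaryReader / BinaryWriter instead of a shadow struct. The method signatures of IWysiwygSerializable, the Header class and its layout (a 32-bit version followed by an 8-byte zero-padded ASCII name) are all assumptions made up for illustration; the real protocol layout would replace them.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical interface: the question only names it and its two methods.
public interface IWysiwygSerializable
{
    void Write(BinaryWriter writer);
    void Read(BinaryReader reader);
}

// Hypothetical message: an Int32 version, then a name stored as
// exactly 8 ASCII bytes, padded with zeros.
public sealed class Header : IWysiwygSerializable
{
    private const int NameLength = 8;

    public int Version;
    public string Name = "";

    public void Write(BinaryWriter writer)
    {
        // BinaryWriter always writes integers in little-endian order.
        writer.Write(Version);

        byte[] raw = Encoding.ASCII.GetBytes(Name);
        if (raw.Length > NameLength)
            throw new InvalidOperationException("Name is too long for the fixed field.");

        // New arrays are zero-filled, so the tail acts as padding.
        byte[] field = new byte[NameLength];
        Array.Copy(raw, field, raw.Length);

        // Write(byte[]) emits the raw bytes with no length prefix.
        writer.Write(field);
    }

    public void Read(BinaryReader reader)
    {
        Version = reader.ReadInt32();

        byte[] field = reader.ReadBytes(NameLength);
        if (field.Length != NameLength)
            throw new EndOfStreamException();

        // Drop the zero padding added by Write.
        Name = Encoding.ASCII.GetString(field).TrimEnd('\0');
    }
}
```

With this layout every Header occupies 12 bytes on the wire, regardless of how long Name actually is.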

Thanks!

I wrote a rather simple but extensible framework for declarative binary serialization. I've used it extensively in my job and always found that it saves a tremendous amount of effort:

binaryserializer.com

Don't use the BinaryFormatter. Instead use BinaryWriter and BinaryReader to write the exact bytes to disk that you want written to disk, in the exact order you want. If arrays aren't handled the way you like, then you'll just have to loop through the array yourself. To make that look cleaner, you could perhaps write an extension method to do the loop.
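As a rough sketch of such an extension method: the names BinaryIoExtensions, WritePadded and ReadInt32Array are made up here, and the choice of int[] and of zero as the padding value is an assumption; the real format dictates both.

```csharp
using System;
using System.IO;

public static class BinaryIoExtensions
{
    // Writes exactly fieldLength Int32 values: the array contents first,
    // then zeros as padding up to the fixed field size.
    public static void WritePadded(this BinaryWriter writer, int[] values, int fieldLength)
    {
        if (values.Length > fieldLength)
            throw new ArgumentException("Array is longer than the fixed field.", nameof(values));

        foreach (int value in values)
            writer.Write(value);

        for (int i = values.Length; i < fieldLength; i++)
            writer.Write(0);
    }

    // Reads a fixed-size field of fieldLength Int32 values (padding included).
    public static int[] ReadInt32Array(this BinaryReader reader, int fieldLength)
    {
        var values = new int[fieldLength];
        for (int i = 0; i < fieldLength; i++)
            values[i] = reader.ReadInt32();
        return values;
    }
}
```

The "5 elements written as 10" case from the question then becomes a single call, writer.WritePadded(new[] { 1, 2, 3, 4, 5 }, 10), which writes 40 bytes.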

(note that this perhaps could be considered advertising, but the "product" is open sourced MIT licensed and even the other referenced "product" is open sourced MIT licensed) (note that I'm the author of the advertised "product" and of the other referenced "product")

There wasn't any "good" solution, so I wrote my own :-) I had to create a library just to create the library: FluentSerializer. The library can be used to create a "description" of how you want your binary data to be serialized. This description is written in a fluent notation. You can (through the other library I have written, FluentStatement) include in your fluent description all the usual statements like while, if, for... (clearly using a fluent notation even there). Your description is then compiled as an Expression Tree and then as a group of dynamic methods (Serialize, Deserialize and Size (of the serialized data)).
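To give an idea of what "compiled as an Expression Tree and then as dynamic methods" means in general, here is a small sketch that uses only the standard System.Linq.Expressions API. It is not FluentSerializer's actual implementation; the Point class and ExpressionSketch.BuildWriter are hypothetical names invented for this illustration.

```csharp
using System;
using System.IO;
using System.Linq.Expressions;

// Hypothetical type to serialize.
public sealed class Point
{
    public int X;
    public int Y;
}

public static class ExpressionSketch
{
    // Builds the tree for
    //     (writer, p) => { writer.Write(p.X); writer.Write(p.Y); }
    // and compiles it into a delegate.
    public static Action<BinaryWriter, Point> BuildWriter()
    {
        ParameterExpression writer = Expression.Parameter(typeof(BinaryWriter), "writer");
        ParameterExpression point = Expression.Parameter(typeof(Point), "p");

        // Pick the BinaryWriter.Write(int) overload via reflection.
        var writeInt32 = typeof(BinaryWriter).GetMethod("Write", new[] { typeof(int) });

        Expression body = Expression.Block(
            Expression.Call(writer, writeInt32, Expression.Field(point, "X")),
            Expression.Call(writer, writeInt32, Expression.Field(point, "Y")));

        return Expression
            .Lambda<Action<BinaryWriter, Point>>(body, writer, point)
            .Compile();
    }
}
```

The tree is built once; the compiled delegate can then be cached and called for every object, which is why this kind of approach pays off mainly when a lot of data is serialized.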

A small sample of a serializer for a test class:

/// <summary>
/// 
/// </summary>
public class Serializer : ISerializer<MyClass, EmptyParameters>
{
    #region ISerializer<MyClass,EmptyParameters> Members

    /// <summary>
    /// 
    /// </summary>
    /// <returns></returns>
    public Expression<Serializer<MyClass, EmptyParameters>> GetSerializer()
    {
        return (op, obj, par) => Statement.Start(fl => fl
            .Serialize(obj.Version)

            // Static objects can be serialized/deserialized.
            .Serialize(MyClass.StaticReadonlyInts1, typeof(FixedLength<>))

            // So can readonly collections.
            .Serialize(obj.ReadonlyInts1, typeof(FixedLength<>))

            // Both array and List<> (and Dictionary<,>, and SortedDictionary<,>, and
            // many other types of collections)
            ////.Serialize(obj.ReadonlyList1)
            .Serialize(obj.ReadonlyList1, typeof(VariableLengthByte<>))

            ////// Readonly fields can be serialized/deserialized.
            ////// Sadly you can't Dump() serializers that replace read-only fields
            ////// ("replace" is the keyword here: readonly int X is a no-no,
            ////// readonly List<int> X is a yes, readonly int[] X is a yes if it's
            ////// FixedLength<>).
            ////.Serialize(obj.ReadonlyInt1)

            .Serialize(obj.Bool1)
            .Serialize(obj.Int2)

            // This will be serialized/deserialized only if obj.Version != 0
            // It's only an example of what you can do. You can use the full power of
            // FluentStatement, and remember that if instead of EmptyParameters you
            // had used another class as the parameters, you could have manipulated it
            // through the par object, so par.Version for example.
            .If(obj.Version != 0, fl.Serialize(obj.Int3))

            // These ifs depend on the operation that is being done
            // (serialization/deserialization/size)
            .If(op == Operation.Serialize, fl.Serialize(obj.Int2))
            .If(op == Operation.Deserialize, fl.Serialize(obj.Int3))

            .Serialize(obj.Short1)

            // Tuples are supported.
            .Serialize(obj.Tuple1)

            // Arrays need to have the length prepended. There are helpers for this.
            // The default helper can be specified in the Serializer<T> constructor and
            // will be used if the field serializer isn't specified.
            ////.Serialize(obj.Ints1)

            // Or you can specify it:
            .Serialize(obj.Ints2, typeof(VariableLengthByte<>))
            .Serialize(obj.Ints3, typeof(VariableLengthByte<int[]>))

            // Nullable types are supported
            .Serialize(obj.NullableInt1, typeof(Nullable<int>))
            ////.Serialize(obj.NullableInt2)

            // But note that you could even use Optional<> with value types,
            // useful for example if you have to use a modifier that is a class
            // (like VariableLengthInt32 for example)
            .Serialize(obj.NullableInt1, typeof(Optional<int>))
            .Serialize(obj.NullableInt2, typeof(Optional<>))

            // So are "optional" objects (fields that can be null)
            // (Note that here if we wanted to specify the helper, we would have
            // to use typeof(Optional<VariableLengthByte<int>>))
            .Serialize(obj.OptionalInts1, typeof(Optional<VariableLengthInt32<int[]>>))
            .Serialize(obj.OptionalInts2, typeof(Optional<>))
            .Serialize(obj.OptionalList1, typeof(Optional<VariableLengthInt32<List<int>>>))
            .Serialize(obj.OptionalList2, typeof(Optional<>))

            // You can serialize a DateTime as the full .NET value
            .Serialize(obj.DateTime1)

            // Or, for example, as an Unix datetime (32 or 64 bits)
            .Serialize(obj.DateTime2, typeof(UnixDateTime<int>))

            .Serialize(obj.Float1)
            .Serialize(obj.Double1)
            .Serialize(obj.Decimal1)
            .Serialize(obj.TimeSpan1)

            // For strings it's a little more complex. There are too many combinations
            // of possible formats (encoding x string length x (use char or byte length)).
            // At this time there isn't any helper for C strings (null-terminated strings).
            // You have to "manually" register your string formats.
            .Serialize(obj.String1, typeof(Program.MyUtf8VariableLengthInt32String))
            .Serialize(obj.String2, typeof(Program.MyAsciiVariableLengthInt32String))
            .Serialize(obj.String3, typeof(Program.MyUnicodeVariableLengthInt32String))

            // Chain serializing the base class can be done in this way
            .Serialize(obj, typeof(MySimpleClass))

            // This is only to make it easy to add new serialization fields. The last ) is
            // "attached" to the .Empty and doesn't need to be moved.
            .Empty());
    }

    #endregion
}

Clearly this library is good only if you have to serialize/deserialize a lot of data. If you only have a single object to serialize/deserialize, BinaryReader / BinaryWriter are probably enough for you (as suggested by me in the original question and by Fantius in his answer).
