
How to save & append to a serialized MessagePack binary file in C#?

I'm trying to use MessagePack to save multiple lists of structs because I read that its performance is better than BinaryFormatter serialization.

What I want to do is receive real-time time series data and periodically append it to a file on disk, for example whenever a list reaches 100 elements. My questions are:

1) Is it better to serialize the lists of structs and save them to disk asynchronously in this scenario?

2) How can I simply save them to disk with MessagePack?

public struct struct_realTime
{
    public int indexNum { get; set; }
    public string currentTime { get; set; }
    public string currentType { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        List<struct_realTime> list_temp = new List<struct_realTime>(100000);

        for (int num=0; num < 100000; num++)
        {
            list_temp.Add(new struct_realTime
            {
                indexNum = 1,
                currentTime = "time",
                currentType = "type",
            });
        }

        string filename = "file.bin";

        using (var fileStream = new FileStream(filename, FileMode.Append, FileAccess.Write))
        {
            byte[] bytes = MessagePackSerializer.Serialize(list_temp);
            Console.WriteLine(MessagePackSerializer.ToJson(bytes));
        }
    }
}

When I run this code, it creates file.bin and prints out 100000 structs, but the file is 0 bytes.

When I use BinaryFormatter, I do this:

using (var fileStream = new FileStream("file.bin", FileMode.Append))
{
    BinaryFormatter formatter = new BinaryFormatter();
    formatter.Serialize(fileStream, list_temp);
}

How can I fix the problem?

What you are trying to do is to append an object (here List<struct_realTime>) serialized using MessagePackSerializer to a file containing an already-serialized sequence of similar objects, in the same way it is possible with BinaryFormatter, protobuf-net or Json.NET. Later, you presumably want to be able to deserialize the entire sequence into a list or array of objects of the same type.

Your code has three problems, two simple and one fundamental.

The simple problems are as follows:

  • You don't actually write to the fileStream. Instead, do the following:

      // Append each list_temp sequentially
      using (var fileStream = new FileStream(filename, FileMode.Append, FileAccess.Write))
      {
          MessagePackSerializer.Serialize(fileStream, list_temp);
      }

  • You haven't marked struct_realTime with [MessagePackObject] attributes. This can be implemented e.g. as follows:

      [MessagePackObject]
      public struct struct_realTime
      {
          [Key(0)]
          public int indexNum { get; set; }
          [Key(1)]
          public string currentTime { get; set; }
          [Key(2)]
          public string currentType { get; set; }
      }

Having done that, you can now repeatedly serialize list_temp to the file... but you will not be able to read the lists back afterwards! That's because MessagePackSerializer seems to read the entire file when deserializing the root object, skipping over any additional data appended to the file. Thus code like the following will fail, because only one object gets read from the file:

List<List<struct_realTime>> allItemsInFile = new List<List<struct_realTime>>();
using (var fileStream = File.OpenRead(filename))
{
    while (fileStream.Position < fileStream.Length)
    {
        allItemsInFile.Add(MessagePackSerializer.Deserialize<List<struct_realTime>>(fileStream));                   
    }
}
Assert.IsTrue(allItemsInFile.Count == expectedNumberOfRootItemsInFile);

Demo fiddle #1 here .

And code like the following will fail because the (first) root object in the stream is not an array of arrays of objects, but rather just a single array:

List<List<struct_realTime>> allItemsInFile;
using (var fileStream = File.OpenRead(filename))
{
    allItemsInFile = MessagePackSerializer.Deserialize<List<List<struct_realTime>>>(fileStream);
}
Assert.IsTrue(allItemsInFile.Count == expectedNumberOfRootItemsInFile);

Demo fiddle #2 here .
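
To see why both attempts fail, it can help to look at the raw bytes. The sketch below hand-encodes the list [1, 2] according to the MessagePack specification (a fixarray header byte 0x90 | count, then one byte per small positive integer) instead of calling the library, so the class and method names are purely illustrative. Appending the same list twice yields two root arrays back to back, which is a different byte sequence from one outer array containing two inner arrays.

```csharp
using System;
using System.Linq;

public static class MsgPackLayoutDemo
{
    // Hand-encode up to 15 small non-negative ints (0..127) as a MessagePack fixarray:
    // one header byte (0x90 | count), then one positive-fixint byte per item.
    public static byte[] EncodeSmallIntList(params byte[] items)
    {
        if (items.Length > 15 || items.Any(b => b > 0x7f))
            throw new ArgumentException("This sketch only supports fixarray / positive fixint values.");
        return new[] { (byte)(0x90 | items.Length) }.Concat(items).ToArray();
    }

    // What the file holds after appending the same list twice: two root values in a row.
    public static byte[] TwoAppendedRoots()
    {
        var list = EncodeSmallIntList(1, 2);
        return list.Concat(list).ToArray();
    }

    // What a list-of-lists deserializer expects: an outer array header, then the inner arrays.
    public static byte[] ArrayOfTwoArrays()
    {
        var list = EncodeSmallIntList(1, 2);
        return new byte[] { 0x92 }.Concat(list).Concat(list).ToArray();
    }
}
```

BitConverter.ToString(MsgPackLayoutDemo.TwoAppendedRoots()) gives 92-01-02-92-01-02, while the array of arrays is 92-92-01-02-92-01-02. The only difference is the leading outer header, which a plain append never writes.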

As MessagePackSerializer seems to lack the ability to deserialize multiple root objects from a stream, what are your options? Firstly, you could deserialize a List<List<struct_realTime>>, append to it, and then serialize the entire thing back to the file. Presumably you don't want to do that for performance reasons.
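
For completeness, a minimal sketch of that first option might look like the following. It assumes the MessagePack NuGet package is referenced and uses only the stream overloads of Serialize and Deserialize; the class and method names are hypothetical. Every call reads and rewrites the whole file, so the cost grows with the file size.

```csharp
using System.Collections.Generic;
using System.IO;
using MessagePack;

public static class RewriteWholeFile
{
    // Read every previously saved batch, add the new one, and rewrite the file.
    public static void AppendByRewriting<T>(string filename, List<T> batch)
    {
        List<List<T>> all;
        if (File.Exists(filename) && new FileInfo(filename).Length > 0)
        {
            using (var input = File.OpenRead(filename))
                all = MessagePackSerializer.Deserialize<List<List<T>>>(input);
        }
        else
        {
            // Nothing saved yet: start with an empty outer list.
            all = new List<List<T>>();
        }

        all.Add(batch);

        // FileMode.Create truncates any existing content before writing.
        using (var output = new FileStream(filename, FileMode.Create, FileAccess.Write))
            MessagePackSerializer.Serialize(output, all);
    }
}
```

It would be called as RewriteWholeFile.AppendByRewriting(filename, list_temp), with struct_realTime marked up as shown earlier.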

Secondly, using the MessagePack specification directly, you could manually seek to the beginning of the file to parse and rewrite an appropriate array 32 format header, then seek to the end of the file and use MessagePackSerializer to serialize and append the new item. The following static helper method does the job:

public static class MessagePackExtensions
{
    const byte Array32 = 0xdd;
    const int Array32HeaderLength = 5;

    public static void AppendToFile<T>(Stream stream, T item)
    {
        if (stream == null)
            throw new ArgumentNullException(nameof(stream));
        if (!stream.CanSeek)
            throw new ArgumentException("The stream must support seeking.", nameof(stream));

        stream.Position = 0;
        var buffer = new byte[Array32HeaderLength];
        var read = stream.Read(buffer, 0, Array32HeaderLength);
        stream.Position = 0;
        if (read == 0)
        {
            FormatArray32Header(buffer, 1);
            stream.Write(buffer, 0, Array32HeaderLength);
        }
        else
        {
            var count = ParseArray32Header(buffer, read);
            FormatArray32Header(buffer, count + 1);
            stream.Write(buffer, 0, Array32HeaderLength);
        }

        stream.Position = stream.Length;
        MessagePackSerializer.Serialize(stream, item);
    }

    static void FormatArray32Header(byte [] buffer, uint value)
    {
        buffer[0] = Array32;
        buffer[1] = unchecked((byte)(value >> 24));
        buffer[2] = unchecked((byte)(value >> 16));
        buffer[3] = unchecked((byte)(value >> 8));
        buffer[4] = unchecked((byte)value);
    }

    static uint ParseArray32Header(byte [] buffer, int readCount)
    {
        if (readCount < 5 || buffer[0] != Array32)
            throw new ArgumentException("Stream was not positioned on an Array32 header.");
        int i = 1;
        //https://stackoverflow.com/questions/8241060/how-to-get-little-endian-data-from-big-endian-in-c-sharp-using-bitconverter-toin
        //https://stackoverflow.com/a/8241127 by https://stackoverflow.com/users/23354/marc-gravell
        var value = unchecked((uint)((buffer[i++] << 24) | (buffer[i++] << 16) | (buffer[i++] << 8) | buffer[i++]));
        return value;
    }
}
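
As a worked example of the header rewrite, the following sketch replays the same steps against a MemoryStream. It uses BinaryPrimitives (available in .NET Core 2.1+ or via the System.Memory package) for the big-endian count, and a pre-encoded payload stands in for the MessagePackSerializer.Serialize call. The class and method names are illustrative and not part of MessagePack.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public static class Array32HeaderDemo
{
    const byte Array32 = 0xdd;
    const int HeaderLength = 5;

    // Increment (or create) the array 32 header at the start of the stream,
    // then append an already-encoded MessagePack value at the end.
    public static void AppendRaw(Stream stream, byte[] encodedItem)
    {
        var header = new byte[HeaderLength];
        stream.Position = 0;
        int read = stream.Read(header, 0, HeaderLength);

        uint count = 0;
        if (read != 0)
        {
            if (read < HeaderLength || header[0] != Array32)
                throw new InvalidDataException("Stream does not start with an array 32 header.");
            count = BinaryPrimitives.ReadUInt32BigEndian(header.AsSpan(1));
        }

        // Overwrite the first five bytes with the incremented element count.
        header[0] = Array32;
        BinaryPrimitives.WriteUInt32BigEndian(header.AsSpan(1), count + 1);
        stream.Position = 0;
        stream.Write(header, 0, HeaderLength);

        // Append the new element after everything already in the stream.
        stream.Position = stream.Length;
        stream.Write(encodedItem, 0, encodedItem.Length);
    }

    // Append the list [1, 2] (fixarray 0x92, then 0x01, 0x02) twice and return the bytes as hex.
    public static string Demo()
    {
        using (var stream = new MemoryStream())
        {
            var item = new byte[] { 0x92, 0x01, 0x02 };
            AppendRaw(stream, item);
            AppendRaw(stream, item);
            return BitConverter.ToString(stream.ToArray());
        }
    }
}
```

Array32HeaderDemo.Demo() returns DD-00-00-00-02-92-01-02-92-01-02: an array 32 header with a count of 2, followed by the two encoded lists. That is a single root array of arrays, which is why the whole file can later be deserialized in one call.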

It can be used to append your list_temp as follows:

// Append each entry sequentially
using (var fileStream = new FileStream(filename, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
    MessagePackExtensions.AppendToFile(fileStream, list_temp);
}

And then later, to deserialize the entire file, do:

List<List<struct_realTime>> allItemsInFile;
using (var fileStream = File.OpenRead(filename))
{
    allItemsInFile = MessagePackSerializer.Deserialize<List<List<struct_realTime>>>(fileStream);
}

Notes:

Demo fiddle #3 here .
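
As a possible alternative, newer releases of MessagePack-CSharp (2.x, as far as I know) ship a MessagePackStreamReader type that splits a stream into its consecutive top-level values. Assuming that version is available, a file produced by plain appending (without the array 32 header) could be read back roughly as follows. The wrapper class and method names here are hypothetical, and the sketch has not been checked against the 1.x API this answer was written for.

```csharp
using System.Buffers;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;

public static class MultiRootReader
{
    // Deserialize every top-level MessagePack value in the stream as a T.
    public static async Task<List<T>> ReadAllRootsAsync<T>(
        Stream stream, CancellationToken cancellationToken = default)
    {
        var roots = new List<T>();
        using (var reader = new MessagePackStreamReader(stream))
        {
            // ReadAsync yields the raw bytes of one complete value, or null at end of stream.
            while (await reader.ReadAsync(cancellationToken) is ReadOnlySequence<byte> msgpack)
            {
                roots.Add(MessagePackSerializer.Deserialize<T>(msgpack, cancellationToken: cancellationToken));
            }
        }
        return roots;
    }
}
```

With this approach each appended list_temp would come back as one element of the result of ReadAllRootsAsync<List<struct_realTime>>(fileStream), and no header patching is needed when writing.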
