
Efficient manual deserialization of nested objects from JSON using Newtonsoft in C#

I have this mesh that I want to read from JSON as fast as possible.

[Serializable]
public class Mesh
{
    public int[] Faces { get; set; }
    public Vec3[] Vertices { get; set; }
}

[Serializable]
public class Vec3
{
    public double X { get; set; }
    public double Y { get; set; }
    public double Z { get; set; }
}

I've read that writing a manual serializer and deserializer is much faster than using reflection, so I tried to create my own:

public static Mesh FromJson(JsonTextReader reader)
{
    var mesh = new Mesh();
    var currentProperty = string.Empty;
    List<int> faces = new List<int>();
    List<Vec3> Vertices = new List<Vec3>();
    double X = 0, Y = 0, Z = 0;
    while (reader.Read())
    {
        if (reader.Value != null)
        {
            if (reader.TokenType == JsonToken.PropertyName)
                currentProperty = reader.Value.ToString();

            else if (reader.TokenType == JsonToken.Integer && currentProperty == "Faces")
                faces.Add(Int32.Parse(reader.Value.ToString()));

            else if (reader.TokenType == JsonToken.Float && currentProperty == "X")
            {
                X = float.Parse(reader.Value.ToString());
            }

            else if (reader.TokenType == JsonToken.Float && currentProperty == "Y")
            {
                Y = float.Parse(reader.Value.ToString());
            }

            else if (reader.TokenType == JsonToken.Float && currentProperty == "Z")
            {
                Z = float.Parse(reader.Value.ToString());
            }
        }
        else
        {
            //Console.WriteLine("Token: {0}", reader.TokenType);

            if (reader.TokenType == JsonToken.EndObject && reader.Path.Contains("Vertices"))
                Vertices.Add(new Vec3 { X = X, Y = Y, Z = Z });

        }
    }

    mesh.Faces = faces.ToArray();
    mesh.Vertices = Vertices.ToArray();
    return mesh;
}

Writing is about 30% faster with this approach (I was expecting a bit more, to be honest), but reading is giving me trouble. I suspect the nesting of Vec3 is the cause, because similar logic without Vec3 gives a decent improvement of about 30%. The only examples I could find dealt with simple data structures without nesting, such as this, so I feel my handling here is too simplistic.

A 30% speed-up when replacing automatic deserialization with manual deserialization is not unexpected. Json.NET caches all reflection results in its contract resolver so the overhead from reflection isn't as bad as you might think. There's a one-time penalty for building the contract for a given type but if you are deserializing a large file then the penalty gets amortized and the resulting contract provides delegates for fast getting and setting of property values.
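For comparison, the reflection-based baseline that a manual reader competes against might look like the sketch below. The class name BaselineReader is hypothetical; the sketch assumes the serializer is shared so that the cached contracts are reused across calls, and that the input is streamed through a JsonTextReader rather than loaded into a string first.

```csharp
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper: automatic (contract-based) deserialization from a stream.
public static class BaselineReader
{
    // One shared serializer with default settings, so the contract built for
    // a type on the first call can be reused by later calls.
    static readonly JsonSerializer Serializer = JsonSerializer.CreateDefault();

    public static T FromJson<T>(TextReader textReader)
    {
        // Disposing the JsonTextReader also closes the underlying TextReader by default.
        using (var jsonReader = new JsonTextReader(textReader))
        {
            return Serializer.Deserialize<T>(jsonReader);
        }
    }
}
```

Timing BaselineReader.FromJson<Mesh>(...) against the manual reader on the same large file, after a warm-up call, should show how much of the gap is really due to reflection.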

That being said, I see the following issues (bugs and possible optimizations) with your code:

  1. As you read through the JSON, you aren't keeping track of the parsing state. This requires you to do reader.Path.Contains("Vertices") which is not performant. It also makes your code vulnerable to unexpected behaviors when the JSON data is not as expected.

  2. You are checking string equality for currentProperty, but if you added all the expected property names to a DefaultJsonNameTable and assigned it to JsonTextReader.PropertyNameTable, you could replace those checks with reference equality checks, saving some time and some memory allocations.

    Note that PropertyNameTable was added in Json.NET 12.0.1.

  3. X, Y, and Z are doubles, but you are parsing them as floats:

     X = float.Parse(reader.Value.ToString());

    This is a bug that will cause accuracy loss. What's more, you are parsing them in the current culture (which might have a localized decimal separator) rather than the invariant culture, which is another bug.

  4. In any event, there is no need to parse reader.Value as a double because it should already be a double. Likewise, an integer value should already be a long. Simply converting to the required primitive type should be sufficient and faster.

  5. Disabling automatic date recognition may save you some time.
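Points 3 and 4 can be seen in isolation with a small sketch that uses only the base class library. The class and method names are hypothetical, and the boxed double stands in for what reader.Value holds for a Float token under the default settings. The first method mimics the original code path (format, then parse as float); the second simply converts the boxed value.

```csharp
using System;
using System.Globalization;

// Hypothetical demo of the float-vs-double issue, independent of Json.NET.
public static class ParsePrecisionDemo
{
    // Original approach: turn the boxed value into text, then parse it as a float.
    // The result is rounded to float precision before being widened to double.
    public static double ViaFloatParse(object value)
    {
        var text = Convert.ToString(value, CultureInfo.InvariantCulture);
        return float.Parse(text, CultureInfo.InvariantCulture);
    }

    // Suggested approach: the boxed value is already a double, so convert it
    // directly with no string round trip and no culture-dependent parsing.
    public static double ViaConvert(object value)
    {
        return Convert.ToDouble(value, CultureInfo.InvariantCulture);
    }
}
```

For a boxed value such as 0.1234567890123, ViaFloatParse returns a number that no longer equals the input, while ViaConvert returns it unchanged. The sketch passes the invariant culture explicitly; the original code omitted it, which is the second bug mentioned in point 3.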

The following version of FromJson() resolves these issues:

public static partial class MeshExtensions
{
    const string Vertices = "Vertices";
    const string Faces = "Faces";
    const string X = "X";
    const string Y = "Y";
    const string Z = "Z";

    public static Mesh FromJson(JsonTextReader reader)
    {
        var nameTable = new DefaultJsonNameTable();
        nameTable.Add(Vertices);
        nameTable.Add(Faces);
        nameTable.Add(X);
        nameTable.Add(Y);
        nameTable.Add(Z);
        reader.PropertyNameTable = nameTable;  // For performance
        reader.DateParseHandling = DateParseHandling.None;  // Possibly for performance.

        bool verticesFound = false;
        List<Vec3> vertices = null;
        bool facesFound = false;
        List<int> faces = null;

        while (reader.ReadToContent())
        {
            if (reader.TokenType == JsonToken.PropertyName && reader.Value == (object)Vertices)
            {
                if (verticesFound)
                    throw new JsonSerializationException("Multiple vertices");
                reader.ReadToContentAndAssert(); // Advance past the property name
                vertices = ReadVertices(reader); // Read the vertices array
                verticesFound = true;
            }
            else if (reader.TokenType == JsonToken.PropertyName && reader.Value == (object)Faces)
            {
                if (facesFound)
                    throw new JsonSerializationException("Multiple faces");
                reader.ReadToContentAndAssert(); // Advance past the property name
                faces = reader.ReadIntArray(); // Read the faces array
                facesFound = true;
            }
        }

        return new Mesh
        {
            Vertices = vertices == null ? null : vertices.ToArray(),
            Faces = faces == null ? null : faces.ToArray(),
        };
    }

    static List<Vec3> ReadVertices(JsonTextReader reader)
    {
        if (reader.MoveToContentAndAssert().TokenType == JsonToken.Null)
            return null;
        else if (reader.TokenType != JsonToken.StartArray)
            throw new JsonSerializationException(string.Format("Unexpected token type {0}", reader.TokenType));
        var vertices = new List<Vec3>();
        while (reader.ReadToContent())
        {
            switch (reader.TokenType)
            {
                case JsonToken.EndArray:
                    return vertices;

                case JsonToken.Null:
                    // Or throw an exception if you prefer.
                    //throw new JsonSerializationException(string.Format("Unexpected token type {0}", reader.TokenType));
                    vertices.Add(null);
                    break;

                case JsonToken.StartObject:
                    var vertex = ReadVertex(reader);
                    vertices.Add(vertex);
                    break;

                default:
                    // reader.Skip();
                    throw new JsonSerializationException(string.Format("Unexpected token type {0}", reader.TokenType));
            }
        }
        throw new JsonReaderException(); // Truncated file.
    }

    static Vec3 ReadVertex(JsonTextReader reader)
    {
        if (reader.MoveToContentAndAssert().TokenType == JsonToken.Null)
            return null;
        else if (reader.TokenType != JsonToken.StartObject)
            throw new JsonException();
        var vec = new Vec3();
        while (reader.ReadToContent())
        {
            switch (reader.TokenType)
            {
                case JsonToken.EndObject:
                    return vec;

                case JsonToken.PropertyName:
                    if (reader.Value == (object)X)
                        vec.X = reader.ReadAsDouble().Value;
                    else if (reader.Value == (object)Y)
                        vec.Y = reader.ReadAsDouble().Value;
                    else if (reader.Value == (object)Z)
                        vec.Z = reader.ReadAsDouble().Value;
                    else // Skip unknown property names and values.
                        reader.ReadToContentAndAssert().Skip();
                    break;

                default:
                    throw new JsonSerializationException(string.Format("Unexpected token type {0}", reader.TokenType));
            }
        }
        throw new JsonReaderException(); // Truncated file.
    }
}

public static class JsonExtensions
{
    public static List<int> ReadIntArray(this JsonReader reader)
    {
        if (reader.MoveToContentAndAssert().TokenType == JsonToken.Null)
            return null;
        else if (reader.TokenType != JsonToken.StartArray)
            throw new JsonReaderException(string.Format("Unexpected token type {0}", reader.TokenType));

        var list = new List<int>();
        // ReadAsInt32() reads the next token as an integer, skipping comments
        for (var value = reader.ReadAsInt32(); true; value = reader.ReadAsInt32())
        {
            if (value != null)
                list.Add(value.Value);
            else 
                // value can be null if we reached the end of the array, encountered a null value, or encountered the end of a truncated file.
                // JsonReader will throw an exception on most types of malformed file, but not on a truncated file.
                switch (reader.TokenType)
                {
                    case JsonToken.EndArray:
                        return list;
                    case JsonToken.Null:
                    default:
                        throw new JsonReaderException(string.Format("Unexpected token type {0}", reader.TokenType));
                }
        }
    }

    public static bool ReadToContent(this JsonReader reader)
    {
        if (reader == null)
            throw new ArgumentNullException();
        if (!reader.Read())
            return false;
        while (reader.TokenType == JsonToken.Comment) // Skip past comments.
            if (!reader.Read())
                return false;
        return true;
    }

    public static JsonReader ReadToContentAndAssert(this JsonReader reader)
    {
        return reader.ReadAndAssert().MoveToContentAndAssert();
    }

    public static JsonReader MoveToContentAndAssert(this JsonReader reader)
    {
        if (reader == null)
            throw new ArgumentNullException();
        if (reader.TokenType == JsonToken.None)       // Skip past beginning of stream.
            reader.ReadAndAssert();
        while (reader.TokenType == JsonToken.Comment) // Skip past comments.
            reader.ReadAndAssert();
        return reader;
    }

    public static JsonReader ReadAndAssert(this JsonReader reader)
    {
        if (reader == null)
            throw new ArgumentNullException();
        if (!reader.Read())
            throw new JsonReaderException("Unexpected end of JSON stream.");
        return reader;
    }
}

Demo fiddle here. As you can see, properly handling boundary conditions such as comments, unexpected properties, and truncated streams makes writing robust manual deserialization code tricky.
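A minimal way to exercise FromJson() might look like the following sketch. It assumes the Mesh, Vec3, MeshExtensions, and JsonExtensions types above are compiled into the same project together with Json.NET 12.0.1 or later; the MeshDemo class and its sample JSON are hypothetical and only for illustration.

```csharp
using System.IO;
using Newtonsoft.Json;

// Hypothetical driver; relies on Mesh and MeshExtensions defined earlier.
public static class MeshDemo
{
    // Small sample with one triangle and one unknown property ("W"),
    // which ReadVertex() is expected to skip.
    public const string SampleJson =
        @"{ ""Faces"": [0, 1, 2],
            ""Vertices"": [
              { ""X"": 0.0, ""Y"": 1.5, ""Z"": -2.25 },
              { ""X"": 1.0, ""Y"": 0.0, ""Z"": 0.0, ""W"": ""ignored"" },
              { ""X"": 0.0, ""Y"": 0.0, ""Z"": 1.0 } ] }";

    public static Mesh Parse(string json)
    {
        using (var textReader = new StringReader(json))
        using (var jsonReader = new JsonTextReader(textReader))
        {
            return MeshExtensions.FromJson(jsonReader);
        }
    }
}
```

Calling MeshDemo.Parse(MeshDemo.SampleJson) should yield a mesh with three faces indices and three vertices. For real files, a StreamReader over the file stream can replace the StringReader so the JSON is never held in memory as a single string.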
