Benchmarking Newtonsoft.Json deserialization: from stream and from string

Question

I'm interested in a performance comparison (speed, memory usage) of two approaches to deserializing an HTTP response JSON payload with Newtonsoft.Json.

I'm aware of Newtonsoft.Json's Performance Tips recommending streams, but I wanted to know more and have hard numbers. I've written a simple benchmark using BenchmarkDotNet, but I'm a bit puzzled by the results (see the numbers below).

What I got:

Parsing from a stream is always faster, but not by much.
Parsing small and "medium" JSON has better or equal memory usage when using a string as input.
A significant difference in memory usage only shows up with large JSON (where the string itself ends up in the LOH).

I haven't had time to do proper profiling (yet), but I'm a bit surprised by the memory overhead of the stream approach (assuming there's no error in my benchmark). The whole code is here.

Questions

Is my approach correct? (usage of MemoryStream; simulating HttpResponseMessage and its content; ...)
Is there any issue with the benchmarking code?
Why do I see such results?

Benchmark setup

I'm preparing a MemoryStream to be used over and over within the benchmark run:

[GlobalSetup]
public void GlobalSetup()
{
    var resourceName = _resourceMapping[typeof(T)];
    using (var resourceStream = Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName))
    {
        _memory = new MemoryStream();
        resourceStream.CopyTo(_memory);
    }

    _iterationRepeats = _repeatMapping[typeof(T)];
}

Stream deserialization

[Benchmark(Description = "Stream d13n")]
public async Task DeserializeStream()
{
    for (var i = 0; i < _iterationRepeats; i++)
    {
        var response = BuildResponse(_memory);

        using (var streamReader = BuildNonClosingStreamReader(await response.Content.ReadAsStreamAsync()))
        using (var jsonReader = new JsonTextReader(streamReader))
        {
            _serializer.Deserialize<T>(jsonReader);
        }
    }
}

String deserialization

We first read the JSON from the stream into a string and then run deserialization: an extra string is allocated and then used as the deserialization input.

[Benchmark(Description = "String d13n")]
public async Task DeserializeString()
{
    for (var i = 0; i < _iterationRepeats; i++)
    {
        var response = BuildResponse(_memory);

        var content = await response.Content.ReadAsStringAsync();
        JsonConvert.DeserializeObject<T>(content);
    }
}

Common methods

private static HttpResponseMessage BuildResponse(Stream stream)
{
    stream.Seek(0, SeekOrigin.Begin);

    var content = new StreamContent(stream);
    content.Headers.ContentType = new MediaTypeHeaderValue("application/json");

    return new HttpResponseMessage(HttpStatusCode.OK)
    {
        Content = content
    };
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static StreamReader BuildNonClosingStreamReader(Stream inputStream) =>
    new StreamReader(
        stream: inputStream,
        encoding: Encoding.UTF8,
        detectEncodingFromByteOrderMarks: true,
        bufferSize: 1024,
        leaveOpen: true);

Results

Small JSON

Repeated 10000 times

Stream: mean 25.69 ms, 61.34 MB allocated
String: mean 31.22 ms, 36.01 MB allocated

Medium JSON

Repeated 1000 times

Stream: mean 24.07 ms, 12 MB allocated
String: mean 25.09 ms, 12.85 MB allocated

Large JSON

Repeated 100 times

Stream: mean 229.6 ms, 47.54 MB allocated, objects got to Gen 1
String: mean 240.8 ms, 92.42 MB allocated, objects got to Gen 2!

Update

I went through the source of JsonConvert and found that it internally uses JsonTextReader with StringReader when deserializing from string: JsonConvert:816. So a TextReader is involved there as well (of course!).
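
For illustration, here is a rough sketch of what the string path boils down to, based on my reading of the JsonConvert source. The names StringPathSketch and DeserializeLikeJsonConvert are made up for this example, and the real implementation also deals with settings and non-generic overloads, so treat it as an approximation rather than the library's actual code.

```csharp
using System.IO;
using Newtonsoft.Json;

public static class StringPathSketch
{
    // Hypothetical helper: approximates the string path, not the library's actual code.
    public static T DeserializeLikeJsonConvert<T>(string json)
    {
        var serializer = JsonSerializer.CreateDefault();

        // As far as I can tell, JsonConvert turns this on unless settings say otherwise,
        // so trailing content after the JSON value is reported as an error.
        serializer.CheckAdditionalContent = true;

        // The string is wrapped in a StringReader (a TextReader, not a Stream),
        // and the same JsonTextReader does the actual parsing.
        using (var jsonReader = new JsonTextReader(new StringReader(json)))
        {
            return serializer.Deserialize<T>(jsonReader);
        }
    }
}
```

Seen this way, both benchmarks exercise the same JsonTextReader; they differ mainly in what sits underneath it (a StreamReader with its buffers versus a StringReader over an already materialized string).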

Then I decided to dig deeper into StreamReader itself, and I was stunned at first sight: it always allocates a buffer array (byte[]), see StreamReader:244, which explains its memory use.

This answers the "why". The solution is simple: use a smaller buffer size when instantiating StreamReader. The minimum buffer size is 128 (see StreamReader.MinBufferSize), although you can supply any value > 0 (check the constructor overloads).

Of course, buffer size affects how the data is processed. As for what buffer size I should then use: it depends. When expecting smaller JSON responses, I think it is safe to stick with a small buffer.
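
Since the right buffer size depends on the payload, one way to decide is to measure it. Below is a minimal sketch of a parameterized BenchmarkDotNet benchmark; the Payload type, the inline JSON and the chosen sizes (128, 1024, 4096) are placeholders I made up, not part of the original benchmark, so swap in your own types and resources.

```csharp
using System.IO;
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Newtonsoft.Json;

[MemoryDiagnoser] // reports allocated bytes per operation next to the timings
public class BufferSizeBenchmark
{
    private readonly JsonSerializer _serializer = JsonSerializer.CreateDefault();
    private MemoryStream _memory;

    // Each value produces a separate benchmark run; the sizes are arbitrary examples.
    [Params(128, 1024, 4096)]
    public int BufferSize { get; set; }

    [GlobalSetup]
    public void GlobalSetup()
    {
        // Placeholder payload; a real run would load a representative response.
        var json = "{\"id\":1,\"name\":\"sample\"}";
        _memory = new MemoryStream(Encoding.UTF8.GetBytes(json));
    }

    [Benchmark]
    public Payload Deserialize()
    {
        _memory.Seek(0, SeekOrigin.Begin);

        // leaveOpen keeps the shared MemoryStream usable for the next invocation.
        using (var streamReader = new StreamReader(_memory, Encoding.UTF8, true, BufferSize, leaveOpen: true))
        using (var jsonReader = new JsonTextReader(streamReader))
        {
            return _serializer.Deserialize<Payload>(jsonReader);
        }
    }

    // Hypothetical target type matching the placeholder JSON above.
    public class Payload
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<BufferSizeBenchmark>();
}
```

Run it in Release configuration and compare the Allocated column across the BufferSize values for payloads of the sizes you actually expect.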

Answer 1

After some fiddling I found the reason behind the memory allocation when using StreamReader. The original post is updated, but to recap here:

StreamReader uses a default bufferSize of 1024. Every instantiation of StreamReader then allocates a byte array of that size. That's why I saw such numbers in my benchmark.

When I set bufferSize to its lowest possible value, 128, the results seem to be much better.
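
As a minimal sketch of that change, the reader construction can be wrapped in a small helper that takes the buffer size as a parameter. JsonStreamHelper and DeserializeFromStream are names invented for this example; only the StreamReader, JsonTextReader and JsonSerializer calls come from the libraries themselves.

```csharp
using System.IO;
using System.Text;
using Newtonsoft.Json;

public static class JsonStreamHelper
{
    private static readonly JsonSerializer Serializer = JsonSerializer.CreateDefault();

    // Hypothetical helper: deserializes from a stream using a small StreamReader buffer.
    public static T DeserializeFromStream<T>(Stream input, int bufferSize = 128)
    {
        using (var streamReader = new StreamReader(
            stream: input,
            encoding: Encoding.UTF8,
            detectEncodingFromByteOrderMarks: true,
            bufferSize: bufferSize,   // 128 is the minimum StreamReader works with
            leaveOpen: true))         // the caller stays responsible for the stream
        using (var jsonReader = new JsonTextReader(streamReader))
        {
            return Serializer.Deserialize<T>(jsonReader);
        }
    }
}
```

With an HttpResponseMessage this would be called as JsonStreamHelper.DeserializeFromStream<MyDto>(await response.Content.ReadAsStreamAsync()), where MyDto stands for whatever type you deserialize into. For large responses a bigger bufferSize may still be the better trade-off, so it is worth measuring both.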
