Line-delimited JSON serializing and deserializing

I am using JSON.NET and C# 5. I need to serialize/deserialize a list of objects into line-delimited JSON (http://en.wikipedia.org/wiki/Line_Delimited_JSON). For example:

{"some":"thing1"}
{"some":"thing2"}
{"some":"thing3"}

and

{"kind": "person", "fullName": "John Doe", "age": 22, "gender": "Male", "citiesLived": [{ "place": "Seattle", "numberOfYears": 5}, {"place": "Stockholm", "numberOfYears": 6}]}
{"kind": "person", "fullName": "Jane Austen", "age": 24, "gender": "Female", "citiesLived": [{"place": "Los Angeles", "numberOfYears": 2}, {"place": "Tokyo", "numberOfYears": 2}]}

I need this because it is a Google BigQuery requirement: https://cloud.google.com/bigquery/preparing-data-for-bigquery

Update: One way I found is to serialize each object separately and join the results with newlines at the end.
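The approach from the update can be sketched roughly as follows. This is only a minimal sketch, assuming JSON.NET (the Newtonsoft.Json package) is referenced; the Foo type and the LineDelimitedJson helper are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical item type matching the {"some":"..."} example.
public class Foo
{
    [JsonProperty("some")]
    public string Some { get; set; }
}

// Hypothetical helper, not part of JSON.NET.
public static class LineDelimitedJson
{
    // Serializes each item separately and joins the results with "\n".
    // Formatting.None keeps every object on a single line.
    public static string Serialize<T>(IEnumerable<T> items)
    {
        return string.Join("\n", items.Select(item => JsonConvert.SerializeObject(item, Formatting.None)));
    }
}

public static class Program
{
    public static void Main()
    {
        var items = new List<Foo>
        {
            new Foo { Some = "thing1" },
            new Foo { Some = "thing2" }
        };

        // Prints one JSON object per line.
        Console.WriteLine(LineDelimitedJson.Serialize(items));
    }
}
```

Strings inside the objects cannot break the format, because JSON.NET escapes embedded newlines as \n within string values. For very large lists, writing each line to a TextWriter instead of building one string would probably be preferable.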

You can do so by manually parsing your JSON using JsonTextReader and setting the SupportMultipleContent flag to true.

If we look at your first example, and create a POCO called Foo:

public class Foo
{
    [JsonProperty("some")]
    public string Some { get; set; }
}

This is how we parse it:

var json = "{\"some\":\"thing1\"}\r\n{\"some\":\"thing2\"}\r\n{\"some\":\"thing3\"}";
var jsonReader = new JsonTextReader(new StringReader(json))
{
    SupportMultipleContent = true // This is important!
};

var jsonSerializer = new JsonSerializer();
while (jsonReader.Read())
{
    Foo foo = jsonSerializer.Deserialize<Foo>(jsonReader);
}

If you want a list of items as the result, simply add each item to a list inside the while loop:

listOfFoo.Add(jsonSerializer.Deserialize<Foo>(jsonReader));
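Put together, the pieces above might be wrapped in a reusable iterator like the one below. This is a sketch, assuming JSON.NET is referenced; LineDelimitedJsonReader and ReadItems are hypothetical names, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public class Foo
{
    [JsonProperty("some")]
    public string Some { get; set; }
}

// Hypothetical helper, not part of JSON.NET.
public static class LineDelimitedJsonReader
{
    // Lazily yields one deserialized item per top-level JSON value in the input.
    public static IEnumerable<T> ReadItems<T>(TextReader textReader)
    {
        var serializer = new JsonSerializer();

        // Disposing the JsonTextReader also closes textReader (CloseInput defaults to true).
        using (var jsonReader = new JsonTextReader(textReader) { SupportMultipleContent = true })
        {
            while (jsonReader.Read())
            {
                yield return serializer.Deserialize<T>(jsonReader);
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var json = "{\"some\":\"thing1\"}\r\n{\"some\":\"thing2\"}\r\n{\"some\":\"thing3\"}";

        var listOfFoo = new List<Foo>(LineDelimitedJsonReader.ReadItems<Foo>(new StringReader(json)));

        foreach (var foo in listOfFoo)
        {
            Console.WriteLine(foo.Some);
        }
    }
}
```

Because the method is an iterator, items are only read as the caller enumerates them, so passing a StreamReader over a large file should avoid loading the whole file into memory.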

Note: with Json.Net 10.0.4 and later, the same code also supports comma-separated JSON entries; see "How to deserialize dodgy JSON (with improperly quoted strings, and missing brackets)?"

To implement this with .NET 5 (C# 9) and the System.Text.Json.JsonSerializer class, and for "big" data, I wrote code for streaming processing.

Using the System.IO.Pipelines extension package, this is quite efficient.

using System;
using System.Buffers;
using System.Collections.Generic;
using System.IO;
using System.IO.Pipelines;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class Program
{
    static readonly byte[] NewLineChars = {(byte)'\r', (byte)'\n'};
    static readonly byte[] WhiteSpaceChars = {(byte)'\r', (byte)'\n', (byte)' ', (byte)'\t'};

    private static async Task Main()
    {
        JsonSerializerOptions jsonOptions = new(JsonSerializerDefaults.Web);
        var json = "{\"some\":\"thing1\"}\r\n{\"some\":\"thing2\"}\r\n{\"some\":\"thing3\"}";
        var contentStream = new MemoryStream(Encoding.UTF8.GetBytes(json));
        var pipeReader = PipeReader.Create(contentStream);
        await foreach (var foo in ReadItemsAsync<Foo>(pipeReader, jsonOptions))
        {
            Console.WriteLine($"foo: {foo.Some}");
        }
    }

    static async IAsyncEnumerable<TValue> ReadItemsAsync<TValue>(PipeReader pipeReader, JsonSerializerOptions jsonOptions = null)
    {
        while (true)
        {
            var result = await pipeReader.ReadAsync();
            var buffer = result.Buffer;
            bool isCompleted = result.IsCompleted;
            SequencePosition bufferPosition = buffer.Start;
            while (true)
            {
                var(value, advanceSequence) = TryReadNextItem<TValue>(buffer, ref bufferPosition, isCompleted, jsonOptions);
                if (value != null)
                {
                    yield return value;
                }

                if (advanceSequence)
                {
                    pipeReader.AdvanceTo(bufferPosition, buffer.End); //advance our position in the pipe
                    break;
                }
            }

            if (isCompleted)
                yield break;
        }
    }

    static (TValue, bool) TryReadNextItem<TValue>(ReadOnlySequence<byte> sequence, ref SequencePosition sequencePosition, bool isCompleted, JsonSerializerOptions jsonOptions)
    {
        var reader = new SequenceReader<byte>(sequence.Slice(sequencePosition));
        while (!reader.End) // loop until we've come to the end or read an item
        {
            if (reader.TryReadToAny(out ReadOnlySpan<byte> itemBytes, NewLineChars, advancePastDelimiter: true))
            {
                sequencePosition = reader.Position;
                if (itemBytes.TrimStart(WhiteSpaceChars).IsEmpty)
                {
                    continue;
                }

                return (JsonSerializer.Deserialize<TValue>(itemBytes, jsonOptions), false);
            }
            else if (isCompleted)
            {
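                // Read the last item, which has no trailing newline.
                // itemBytes is empty here because TryReadToAny failed, so copy the remaining bytes instead.
                var remaining = sequence.Slice(reader.Position);
                using var memoryOwner = MemoryPool<byte>.Shared.Rent((int)remaining.Length);
                // Rent may return a larger buffer than requested, so slice it to the exact length.
                var lastItemBuffer = memoryOwner.Memory.Span.Slice(0, (int)remaining.Length);
                remaining.CopyTo(lastItemBuffer);
                ReadOnlySpan<byte> lastItemBytes = lastItemBuffer;
                reader.Advance(remaining.Length); // advance reader to the end
                sequencePosition = reader.Position;
                if (!lastItemBytes.TrimStart(WhiteSpaceChars).IsEmpty)
                {
                    return (JsonSerializer.Deserialize<TValue>(lastItemBytes, jsonOptions), true);
                }
                else
                {
                    return (default, true);
                }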
            }
            else
            {
                // no more items in sequence
                break;
            }
        }

        // PipeReader needs to read more
        return (default, true);
    }
}

public class Foo
{
    public string Some { get; set; }
}

Run at https://dotnetfiddle.net/6j3KGg
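The code above only covers reading. For the writing direction, a minimal sketch with System.Text.Json could look like the following; LineDelimitedJsonWriter and WriteItemsAsync are hypothetical names, and the sketch assumes the options leave WriteIndented at its default of false so that each object stays on one line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public class Foo
{
    public string Some { get; set; }
}

// Hypothetical helper, not part of System.Text.Json.
public static class LineDelimitedJsonWriter
{
    static readonly byte[] NewLine = { (byte)'\n' };

    // Writes each item as one UTF-8 JSON line, followed by "\n".
    public static async Task WriteItemsAsync<TValue>(Stream stream, IEnumerable<TValue> items, JsonSerializerOptions jsonOptions = null)
    {
        foreach (var item in items)
        {
            await JsonSerializer.SerializeAsync(stream, item, jsonOptions);
            await stream.WriteAsync(NewLine, 0, NewLine.Length);
        }

        await stream.FlushAsync();
    }
}

public static class Program
{
    public static async Task Main()
    {
        var jsonOptions = new JsonSerializerOptions(JsonSerializerDefaults.Web);
        var items = new[] { new Foo { Some = "thing1" }, new Foo { Some = "thing2" } };

        using var stream = new MemoryStream();
        await LineDelimitedJsonWriter.WriteItemsAsync(stream, items, jsonOptions);

        // Prints one JSON object per line.
        Console.Write(Encoding.UTF8.GetString(stream.ToArray()));
    }
}
```

Serializing directly to the stream avoids building the whole document as a string first, which matters for the "big" data case; a FileStream can be passed in place of the MemoryStream.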
