
Deserialize non-standard JSON-like format in C#

I am trying to deserialize a file containing structured data in C#, so that I can export it to CSV and eventually insert it into an MSSQL database.

The file looks a bit like JSON, but only the individual lines are valid JSON; the file as a whole is not.
Example:

{"field1": "value1", "field2": "value2", "field3": "value3"}
{"field1": "value4", "field2": "value5", "field3": "value6"}
{"field1": "value7", "field2": "value8", "field3": "value9"}

I tried using Newtonsoft.Json, looping over the individual lines, but execution takes a very long time this way because the file given to me is very large (several million lines).

Alternatively, I tried turning the whole thing into valid JSON by altering the string with a StringBuilder, but that means loading the entire file into RAM at once, which does not seem like a reasonable option at all.

I even thought about splitting the file into chunks that are then processed concurrently, but I assume there must be a cleaner way to do this?

Is this a format that is standardized in some way? What would be the smartest way to go about importing it? Should I skip the CSV and use C# to insert the data directly into the database?

Any help appreciated!

Given the size of the file, I'd be tempted to create a custom stream (see Implement custom stream) that wraps the real file stream so that the first few bytes it returns are "{items: [", followed by the bytes of the real stream, and finally "]}". For the result to be valid JSON, the wrapper would also have to supply a comma between consecutive lines. That way you could use the Newtonsoft.Json library without building the entire corrected JSON in memory.
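As a possible alternative to writing the wrapper stream yourself: Json.NET's JsonTextReader has a SupportMultipleContent property that lets it read several top-level JSON values from one reader, which should serve the same goal of never holding the whole file in memory. This one-object-per-line layout is, as far as I know, what is usually called JSON Lines or NDJSON. Below is a minimal sketch, not a tuned implementation. The Row class and the JsonLinesReader name are hypothetical, and the field names are simply taken from the sample in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public static class JsonLinesReader
{
    // Hypothetical record type; the property names mirror the sample lines
    // in the question and would need to match the real file.
    public sealed class Row
    {
        [JsonProperty("field1")] public string Field1 { get; set; }
        [JsonProperty("field2")] public string Field2 { get; set; }
        [JsonProperty("field3")] public string Field3 { get; set; }
    }

    // Lazily yields one Row per top-level JSON object found in the input.
    // Only the object currently being deserialized is held in memory.
    public static IEnumerable<Row> ReadRows(TextReader input)
    {
        var serializer = new JsonSerializer();

        // SupportMultipleContent lets the reader continue past the end of
        // one root value and on to the next instead of throwing.
        using (var reader = new JsonTextReader(input) { SupportMultipleContent = true })
        {
            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.StartObject)
                {
                    // Consumes tokens up to the matching end of this object.
                    yield return serializer.Deserialize<Row>(reader);
                }
            }
        }
    }
}
```

You would call it as foreach (var row in JsonLinesReader.ReadRows(new StreamReader(path))) and write each row to the CSV, or batch the rows for the database, inside the loop. Because the method is an iterator, nothing is read until you enumerate it, and disposing the JsonTextReader also closes the underlying reader by default.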
