Deserialize very large complex JSON data from a file (approx. 5 GB) using Newtonsoft Json
I am getting a huge JSON data file, around 5 to 8 GB in size, with the following data, which consists of company info followed by an array of employee details. Each file contains the info for only one company.
I am trying to deserialize this into a C# object. I cannot load the whole file into a string, because that throws an out-of-memory exception.
I am using Newtonsoft Json.
Sample data
{
"Companyname": "ABC Company",
"email": "info@abc.com",
"location": "NYC",
"department": [
{
"deptid": "15345",
"deptname":"dept1",
"projects": ["25A","26B","26C"],
"employees":
[
{
"empid": "1",
"name":"john",
"groupnumber":[234234,34243,343242,2342342]
},
{
"empid": "2",
"name":"Joseph",
"groupnumber":[13245646,78945651,45641546,78978979]
}
]
},
{
"deptid": "5654",
"deptname":"dept2",
"projects": ["125A","226B","26CD"],
"employees":
[
{
"empid": "11",
"name":"Jill",
"groupnumber":[13224231,123133333,8765433,213132333]
},
{
"empid": "122",
"name":"Don",
"groupnumber":[12344,123123234]
}
]
}
]}
Classes
public class CompanyDetails
{
public string companyName{ get; set; }
public string email { get; set; }
public string location { get; set; }
public List<Department> department { get; set; }
}
public class Department
{
public string deptname { get; set; }
public int deptid{ get; set; }
public List<Project> projects{ get; set; }
public List<Employee> employees{ get; set; }
}
public class Project
{
public string projectReference { get; set; }
}
public class Employee
{
public int empid { get; set; }
public string name { get; set; }
public List<GroupNumber> groupnumber { get; set; }
}
public class GroupNumber
{
public long grpnumber { get; set; }
}
Below is my C# code. It's not throwing any error, but the companyData object is empty.
using (StreamReader reader = new StreamReader(file.FullName))
{
var serializer = new JsonSerializer();
CompanyDetails companyData = (CompanyDetails)serializer.Deserialize(reader, typeof(CompanyDetails));
}
Any help is much appreciated.
You have to fix the classes. For example,
public List<Project> projects{ get; set; }
should be
public List<string> projects { get; set; }
I've been using these classes and everything is working properly:
public class CompanyDetails
{
public string Companyname { get; set; }
public string email { get; set; }
public string location { get; set; }
public List<Department> department { get; set; }
}
public class Department
{
public string deptid { get; set; }
public string deptname { get; set; }
public List<string> projects { get; set; }
public List<Employee> employees { get; set; }
}
public class Employee
{
public string empid { get; set; }
public string name { get; set; }
public List<int> groupnumber { get; set; }
}
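One way to confirm that classes like these line up with the JSON, before pointing them at the multi-gigabyte file, is to try them on a trimmed sample first. The sketch below repeats the classes so that it compiles on its own; `SampleCheck`, `Parse` and the shortened sample are names and data made up for illustration, and the Newtonsoft.Json package is assumed to be referenced.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class CompanyDetails
{
    public string Companyname { get; set; }
    public string email { get; set; }
    public string location { get; set; }
    public List<Department> department { get; set; }
}

public class Department
{
    public string deptid { get; set; }
    public string deptname { get; set; }
    public List<string> projects { get; set; }
    public List<Employee> employees { get; set; }
}

public class Employee
{
    public string empid { get; set; }
    public string name { get; set; }
    public List<int> groupnumber { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class SampleCheck
{
    // Trimmed copy of the sample from the question: one department, one employee.
    public const string Sample = @"{
      ""Companyname"": ""ABC Company"",
      ""email"": ""info@abc.com"",
      ""location"": ""NYC"",
      ""department"": [
        {
          ""deptid"": ""15345"",
          ""deptname"": ""dept1"",
          ""projects"": [""25A"", ""26B"", ""26C""],
          ""employees"": [
            { ""empid"": ""1"", ""name"": ""john"", ""groupnumber"": [234234, 34243, 343242, 2342342] }
          ]
        }
      ]
    }";

    // Small inputs only: this takes the JSON as a string, which is exactly
    // what cannot be done with the real 5 to 8 GB file.
    public static CompanyDetails Parse(string json)
    {
        return JsonConvert.DeserializeObject<CompanyDetails>(json);
    }

    public static void Main()
    {
        CompanyDetails company = Parse(Sample);
        Console.WriteLine(company.Companyname);
        Console.WriteLine(company.department[0].projects.Count);
        Console.WriteLine(company.department[0].employees[0].name);
    }
}
```

If any of the printed values comes back null or empty, the class for that level is the one that still disagrees with the JSON.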
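The property names in your C# models need to match the keys in the JSON for it to deserialize correctly. Json.NET falls back to a case-insensitive comparison when there is no exact match, but the spelling and the property types still have to line up with the JSON.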
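If you would rather keep C#-style property names, one option is to pin each property to its JSON key with Json.NET's `[JsonProperty]` attribute. This is only a sketch covering the three top-level string fields; `CompanyHeader` and `MappingDemo` are hypothetical names used for illustration.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical model: only the top-level string fields from the sample.
public class CompanyHeader
{
    [JsonProperty("Companyname")]
    public string CompanyName { get; set; }

    [JsonProperty("email")]
    public string Email { get; set; }

    [JsonProperty("location")]
    public string Location { get; set; }
}

public static class MappingDemo
{
    public static CompanyHeader Parse(string json)
    {
        return JsonConvert.DeserializeObject<CompanyHeader>(json);
    }

    public static void Main()
    {
        string json = @"{ ""Companyname"": ""ABC Company"", ""email"": ""info@abc.com"", ""location"": ""NYC"" }";
        CompanyHeader header = Parse(json);
        Console.WriteLine(header.CompanyName);
        Console.WriteLine(header.Location);
    }
}
```

The attribute makes the mapping explicit, so later renaming a C# property does not silently break the binding.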
You need to change the code inside the using statement a bit to correctly deserialize the file text into your C# model. See this link, shared by @dbc in the comments below: https://www.newtonsoft.com/json/help/html/Performance.htm#MemoryUsage
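The linked page recommends deserializing from a stream rather than from a string. With a file this size, it may also help to avoid building the whole object graph at once. The following is a rough sketch, not the original answer's code: it walks the file with `JsonTextReader` and materializes one `Department` at a time. `DepartmentStream` and `ReadDepartments` are hypothetical names, and it assumes the top-level key is `department`, as in the sample.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public class Employee
{
    public string empid { get; set; }
    public string name { get; set; }
    public List<int> groupnumber { get; set; }
}

public class Department
{
    public string deptid { get; set; }
    public string deptname { get; set; }
    public List<string> projects { get; set; }
    public List<Employee> employees { get; set; }
}

// Hypothetical helper, assuming the file layout shown in the question.
public static class DepartmentStream
{
    // Lazily yields one department at a time; only the current department
    // is held in memory, not the whole file.
    public static IEnumerable<Department> ReadDepartments(TextReader text)
    {
        var serializer = new JsonSerializer();
        using (var reader = new JsonTextReader(text))
        {
            while (reader.Read())
            {
                bool atDepartmentKey =
                    reader.TokenType == JsonToken.PropertyName &&
                    (string)reader.Value == "department";
                if (!atDepartmentKey)
                {
                    continue;
                }

                reader.Read(); // move onto the StartArray token
                // Each Read() lands on the next element's StartObject,
                // or on EndArray once the array is exhausted.
                while (reader.Read() && reader.TokenType == JsonToken.StartObject)
                {
                    yield return serializer.Deserialize<Department>(reader);
                }
            }
        }
    }

    public static void Main(string[] args)
    {
        // args[0] is assumed to be the path to the JSON file.
        using (var file = new StreamReader(args[0]))
        {
            foreach (Department dept in ReadDepartments(file))
            {
                Console.WriteLine(dept.deptname + ": " + dept.employees.Count + " employees");
            }
        }
    }
}
```

If a single department is itself too large to hold in memory, the same pattern could be applied one level down, to the `employees` array.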
Other than that, you're doing great.
EDIT Explained
So as to avoid being raked over the coals again, I overhauled this answer. Apparently, there are some passionate feelings about efficiency of memory usage and directly applicable solutions. Forgive the non-answer and misleading direction offered before.