Deserialize very large, complex JSON data from a file of approx. 5 GB using Newtonsoft Json.NET

I am getting a huge JSON data file, around 5 to 8 GB in size, with the data shown below: company info followed by an array of departments with their employee details. Each file contains the info for exactly one company.

I am trying to deserialize this into a C# object. I cannot load the whole file into a string, because that throws an out-of-memory exception.

I am using Newtonsoft Json.NET.

Sample data

{
"Companyname": "ABC Company",
"email": "info@abc.com",
"location": "NYC",
"department": [
    {
        "deptid": "15345",
        "deptname":"dept1",
        "projects": ["25A","26B","26C"],

        "employees":
          [
            {
             "empid": "1",
             "name":"john",
             "groupnumber":[234234,34243,343242,2342342]
            },
            {
             "empid": "2",
             "name":"Joseph",
             "groupnumber":[13245646,78945651,45641546,78978979]
            }
          ]
    },
    {
        "deptid": "5654",
        "deptname":"dept2",
        "projects": ["125A","226B","26CD"],

        "employees":
          [
            {
             "empid": "11",
             "name":"Jill",
             "groupnumber":[13224231,123133333,8765433,213132333]
            },
            {
             "empid": "122",
             "name":"Don",
             "groupnumber":[12344,123123234]
            }
          ]
    }   
]}

Classes

public class CompanyDetails
{
    public string companyName{ get; set; }
    public string email { get; set; }
    public string location { get; set; }
    public List<Department> department { get; set; }
}

public class Department
{
    public string deptname { get; set; }
    public int deptid{ get; set; }
    public List<Project> projects{ get; set; }
    public List<Employee> employees{ get; set; }
}

public class Project
{
    public string projectReference { get; set; }
}

public class Employee
{
    public int empid { get; set; }
    public string name { get; set; }
    public List<GroupNumber> groupnumber { get; set; }
}

public class GroupNumber
{
    public long grpnumber { get; set; }
}

Below is my C# code. It does not throw any error, but the companyData object is empty.

using (StreamReader reader = new StreamReader(file.FullName))
{
    var serializer = new JsonSerializer();
    CompanyDetails companyData = (CompanyDetails)serializer.Deserialize(reader, typeof(CompanyDetails));
}

Any help is much appreciated.

You have to fix the classes. For example,

 public List<Project> projects{ get; set; }

should be

public List<string> projects { get; set; }

I tested with these classes and everything works properly:


public class CompanyDetails
{
    public string Companyname { get; set; }
    public string email { get; set; }
    public string location { get; set; }
    public List<Department> department { get; set; }
}

public class Department
{
    public string deptid { get; set; }
    public string deptname { get; set; }
    public List<string> projects { get; set; }
    public List<Employee> employees { get; set; }
}

public class Employee
{
    public string empid { get; set; }
    public string name { get; set; }
    public List<int> groupnumber { get; set; }
}
  1. The property names in your C# models need to match the keys in the JSON for it to deserialize correctly (Json.NET falls back to a case-insensitive match, but the names and the value types must still line up).

  2. You need to change the code inside the using statement slightly so that the file is deserialized from a stream into your C# model. See https://www.newtonsoft.com/json/help/html/Performance.htm#MemoryUsage, shared by @dbc in the comments below.

  3. Other than that, you're doing great.
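
Point 2 can be sketched roughly as follows. This is a minimal example, assuming the deserialized object graph itself fits in memory; it reads the file through a JsonTextReader rather than a string. DeserializeFromFile and JsonFileLoader are hypothetical names chosen for illustration, not part of Json.NET.

```csharp
using System.IO;
using Newtonsoft.Json;

public static class JsonFileLoader
{
    // Hypothetical helper: deserializes a JSON file straight from the stream,
    // so the file text is never held in memory as one large string.
    public static T DeserializeFromFile<T>(string path)
    {
        using (var streamReader = new StreamReader(path))
        using (var jsonReader = new JsonTextReader(streamReader))
        {
            var serializer = new JsonSerializer();
            return serializer.Deserialize<T>(jsonReader);
        }
    }
}
```

With the question's variables this would be called as `CompanyDetails companyData = JsonFileLoader.DeserializeFromFile<CompanyDetails>(file.FullName);`. Only the raw text is streamed; the resulting CompanyDetails object is still built in full.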

EDIT Explained

So as to avoid being raked over the coals again, I overhauled this answer. Apparently, there are some passionate feelings about efficiency of memory usage and directly applicable solutions. Forgive the non-answer and misleading direction offered before.
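
On the memory point: if even the fully deserialized object is too large, one possible approach is to walk the file token by token and materialize one array element at a time. The sketch below is an assumption about how that could look, not something from the thread. ReadArrayItems and JsonArrayStreamer are hypothetical names, and the code assumes the target array is a top-level property whose elements are JSON objects, as with "department" in the sample data.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public static class JsonArrayStreamer
{
    // Hypothetical helper: lazily yields the elements of a top-level array
    // property, deserializing one element per iteration.
    public static IEnumerable<T> ReadArrayItems<T>(string path, string arrayPropertyName)
    {
        using (var streamReader = new StreamReader(path))
        using (var jsonReader = new JsonTextReader(streamReader))
        {
            var serializer = new JsonSerializer();
            while (jsonReader.Read())
            {
                // Property names of the root object are reported at depth 1.
                bool isTargetProperty =
                    jsonReader.TokenType == JsonToken.PropertyName
                    && jsonReader.Depth == 1
                    && (string)jsonReader.Value == arrayPropertyName;
                if (!isTargetProperty)
                {
                    continue;
                }

                // Advance to the property value; give up if it is not an array.
                if (!jsonReader.Read() || jsonReader.TokenType != JsonToken.StartArray)
                {
                    yield break;
                }

                // Each Read() lands on the next element's StartObject or on EndArray.
                while (jsonReader.Read() && jsonReader.TokenType == JsonToken.StartObject)
                {
                    // Deserialize consumes exactly one object and leaves the
                    // reader positioned on that object's EndObject token.
                    yield return serializer.Deserialize<T>(jsonReader);
                }
                yield break;
            }
        }
    }
}
```

A caller could then process departments one by one, for example `foreach (Department dept in JsonArrayStreamer.ReadArrayItems<Department>(file.FullName, "department")) { ... }`. This sketch does not capture the company-level fields (Companyname, email, location), and a single very large department would still be loaded whole.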
