Convert pandas DataFrame to arbitrary nested JSON data

Question

Assume that I have a pandas DataFrame called df that looks something like:

source      tables      columns      
src1        table1      col1       
src1        table1      col2
src1        table2      col1 
src2        table1      col1
src2        table1      col2

My current code below can iterate through the list of sources and nest the list of tables within each source as an object:

data = [
    {k: v} 

    for k, v in df.groupby('source')['tables'].agg(
        lambda x: {v: {} for v in x}).items()
    ]

    with open('data.json', 'w') as f:
        json.dump(data, f, indent = 2)

The output I'm receiving with this code is as follows:

[
  {
    "src1": {
      "table1": {},
      "table2": {}
    }
  },
  {
    "src2": {
      "table1": {},
    }
  }
]

My desired output:

[
  {
    "src1": {
      "table1": {
         "col1": {},
         "col2": {}
     },
      "table2": {
         "col1": {}
     }
    }
  },
  {
    "src2": {
      "table1": {
         "col1": {}
      }
    }
  }
]

Any assistance in converting my 2-layer nested JSON file to 3 layers as shown above would be greatly appreciated. Thank you in advance.

Answer 1

Since you have multiple levels of grouping here, I'd recommend just using a for loop to iterate over your data.

from collections import defaultdict  

def make_nested(df): 
    f = lambda: defaultdict(f)   
    data = f()  

    for row in df.to_numpy().tolist():
        t = data
        for r in row[:-1]:
            t = t[r]
        t[row[-1]] = {}

    return data

print(json.dumps(make_nested(df), indent=2))
{
  "src1": {
    "table1": {
      "col1": {},
      "col2": {}
    },
    "table2": {
      "col1": {}
    }
  },
  "src2": {
    "table1": {
      "col1": {},
      "col2": {}
    }
  }
}

This assumes your columns are arranged from left to right: outermost keys to innermost key.

Convert pandas DataFrame to arbitrary nested JSON data

Question

1 answers

solution1
2 ACCPTED 2020-03-01 20:42:38

Convert pandas DataFrame to arbitrary nested JSON data

Question

1 answers

solution1 2 ACCPTED 2020-03-01 20:42:38

solution1
2 ACCPTED 2020-03-01 20:42:38