Convert pandas DataFrame to arbitrary nested JSON data

Assume that I have a pandas DataFrame called df that looks something like:

source      tables      columns      
src1        table1      col1       
src1        table1      col2
src1        table2      col1 
src2        table1      col1
src2        table1      col2

My current code below iterates through the list of sources and nests the tables of each source as an object:

import json

# Group by source and map each table name to an empty object.
data = [
    {k: v}
    for k, v in df.groupby('source')['tables'].agg(
        lambda x: {v: {} for v in x}).items()
]

with open('data.json', 'w') as f:
    json.dump(data, f, indent=2)

The output I'm receiving with this code is as follows:

[
  {
    "src1": {
      "table1": {},
      "table2": {}
    }
  },
  {
    "src2": {
      "table1": {}
    }
  }
]

My desired output:

[
  {
    "src1": {
      "table1": {
        "col1": {},
        "col2": {}
      },
      "table2": {
        "col1": {}
      }
    }
  },
  {
    "src2": {
      "table1": {
        "col1": {},
        "col2": {}
      }
    }
  }
]

Any assistance in converting my 2-layer nested JSON file to 3 layers as shown above would be greatly appreciated. Thank you in advance.

Since you have multiple levels of grouping here, I'd recommend just using a for loop to iterate over your data.

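import json
from collections import defaultdict

def make_nested(df):
    # Recursive defaultdict: looking up a missing key creates another nested level.
    f = lambda: defaultdict(f)
    data = f()

    for row in df.to_numpy().tolist():
        t = data
        # Walk down through every column except the last one.
        for r in row[:-1]:
            t = t[r]
        # The last column becomes an empty leaf object.
        t[row[-1]] = {}

    return data

print(json.dumps(make_nested(df), indent=2))

Output:

{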
  "src1": {
    "table1": {
      "col1": {},
      "col2": {}
    },
    "table2": {
      "col1": {}
    }
  },
  "src2": {
    "table1": {
      "col1": {},
      "col2": {}
    }
  }
}
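
The result above is a single object keyed by source, whereas the question asks for a list with one single-key object per source. A minimal sketch of that last step follows. It assumes plain row lists in place of the DataFrame (`rows` stands in for `df.to_numpy().tolist()`), and `make_nested_rows` is a hypothetical variant of `make_nested` introduced only for this illustration.

```python
import json
from collections import defaultdict

def make_nested_rows(rows):
    """Same loop as make_nested, but over plain row lists (illustrative helper)."""
    f = lambda: defaultdict(f)
    data = f()
    for row in rows:
        t = data
        for r in row[:-1]:
            t = t[r]
        t[row[-1]] = {}
    return data

# Stand-in for df.to_numpy().tolist() on the sample DataFrame from the question.
rows = [
    ["src1", "table1", "col1"],
    ["src1", "table1", "col2"],
    ["src1", "table2", "col1"],
    ["src2", "table1", "col1"],
    ["src2", "table1", "col2"],
]

nested = make_nested_rows(rows)

# Split the top level into one single-key object per source.
data = [{source: tables} for source, tables in nested.items()]

print(json.dumps(data, indent=2))
```

Because `defaultdict` is a `dict` subclass, `json.dumps` serializes it directly, so no conversion back to plain dicts should be needed before writing the file.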

This assumes your columns are arranged from left to right, from the outermost key to the innermost key.
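
If the columns are not already in that order, one option is to nest by column name instead of by position. The sketch below is only an illustration under assumptions: `nest_records` and its `key_order` parameter are hypothetical names, and `records` stands in for what `df.to_dict("records")` would return, with the keys deliberately shuffled.

```python
def nest_records(records, key_order):
    """Nest a list of row dicts by the column names in key_order, outermost first."""
    data = {}
    for record in records:
        node = data
        for key in key_order:
            # setdefault creates the level the first time it is seen and reuses it afterwards.
            node = node.setdefault(record[key], {})
    return data

# Stand-in for df.to_dict("records"); the column order here does not matter.
records = [
    {"columns": "col1", "source": "src1", "tables": "table1"},
    {"columns": "col2", "source": "src1", "tables": "table1"},
    {"columns": "col1", "source": "src1", "tables": "table2"},
    {"columns": "col1", "source": "src2", "tables": "table1"},
    {"columns": "col2", "source": "src2", "tables": "table1"},
]

nested = nest_records(records, ["source", "tables", "columns"])
```

This version returns plain dicts rather than `defaultdict` objects, so a later lookup of a missing key raises `KeyError` instead of silently creating an empty level.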
