繁体   English   中英

将 Json 与 Pandas 拼合(多个列表)

[英]Flatten Json with Pandas (multiple lists)

我想使用 json 文件返回 pandas dataframe,其中每一行都列出了所有数据。 json 文件如下所示。

{
  "building_element_group": [
    {
      "basetype": "facade",
      "building_element": [
        {
          "type": "Unitised",
          "functional_unit": "m2",
          "quantity": 5.74,
          "element": [
            {
              "id": "13d22d3b-7fc6-4116-93ad-80c139e006dc",
              "type": "glazing",
              "quantity_unit": "m2",
              "quantity": 3.29,
              "material": [
                {
                  "type": "glass",
                  "impact_data_ID": "5726d14e-d36e-417d-afc4-c70793080186",
                  "quantity_unit": "m2/m2",
                  "quantity": 1
                }
              ]
            },
            {
              "id": "045d27e6-8397-4672-9f4a-6cbc5fe4e716",
              "type": "cladding",
              "quantity_unit": "m2",
              "quantity": 6.27,
              "material": [
                {
                  "type": "terracotta",
                  "impact_data_ID": "529d8876-6adb-449c-a12a-74c56aaadc4f",
                  "quantity_unit": "m/m2",
                  "quantity": 0.04
                },
                {
                  "type": "brick",
                  "impact_data_ID": "e28d29a9-38f8-4684-a6b1-0615ac7f66e5",
                  "quantity_unit": "m/m2",
                  "quantity": 0.06
                },
                {
                  "type": "GRC",
                  "impact_data_ID": "5043ffe6-9d2e-448e-83ed-f36f1f5decfc",
                  "quantity_unit": "m/m2",
                  "quantity": 0.025
                },
                {
                  "type": "Fiber cement",
                  "impact_data_ID": "53bbd2be-f9ac-4ee7-88f3-34df68ee5187",
                  "quantity_unit": "m/m2",
                  "quantity": 0.013
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

然后我加载了上面的文件并完成了以下操作:

test = pd.json_normalize(df['building_element_group'],
record_path= ['building_element', 'element', 'material'], 
meta = ['basetype', 
['building_element','quantity'], 
['building_element','type'], 
['building_element','element', 'quantity_unit'], 
['building_element','element', 'type']], 
errors='ignore', sep='-')

我想要做的是能够在每一行中显示所有 json 数据,因此所有嵌套数据。 我已经使用 Meta 来执行此操作,但我必须手动输入我需要的所有分支。 有没有办法做到这一点,所以我不需要手动执行此操作?

考虑与列表/字典理解争论不休,其中包括在每个级别合并字典,然后将结果传递给 DataFrame 构造函数:

import json
import pandas as pd

with open("BuildingElementMaterial.json") as f:
    data = json.load(f)

pd_data = [
   {
     **{f"group_{k}":v for k,v in g.items() if k != "building_element"},
     **{f"building_{k}":v for k,v in b.items() if k != "element"},
     **{f"element_{k}":v for k,v in e.items() if k != "material"},
     **{f"material_{k}":v for k,v in m.items()}
   }
   for g in data["building_element_group"]
   for b in g["building_element"]
   for e in b["element"]
   for m in e["material"]   
]

material_df = pd.DataFrame(pd_data)

Output

Dictionary

print(pd_data)
[
 {'building_functional_unit': 'm2',
  'building_quantity': 5.74,
  'building_type': 'Unitised',
  'element_id': '13d22d3b-7fc6-4116-93ad-80c139e006dc',
  'element_quantity': 3.29,
  'element_quantity_unit': 'm2',
  'element_type': 'glazing',
  'group_basetype': 'facade',
  'material_impact_data_ID': '5726d14e-d36e-417d-afc4-c70793080186',
  'material_quantity': 1,
  'material_quantity_unit': 'm2/m2',
  'material_type': 'glass'},
 {'building_functional_unit': 'm2',
  'building_quantity': 5.74,
  'building_type': 'Unitised',
  'element_id': '045d27e6-8397-4672-9f4a-6cbc5fe4e716',
  'element_quantity': 6.27,
  'element_quantity_unit': 'm2',
  'element_type': 'cladding',
  'group_basetype': 'facade',
  'material_impact_data_ID': '529d8876-6adb-449c-a12a-74c56aaadc4f',
  'material_quantity': 0.04,
  'material_quantity_unit': 'm/m2',
  'material_type': 'terracotta'},
 {'building_functional_unit': 'm2',
  'building_quantity': 5.74,
  'building_type': 'Unitised',
  'element_id': '045d27e6-8397-4672-9f4a-6cbc5fe4e716',
  'element_quantity': 6.27,
  'element_quantity_unit': 'm2',
  'element_type': 'cladding',
  'group_basetype': 'facade',
  'material_impact_data_ID': 'e28d29a9-38f8-4684-a6b1-0615ac7f66e5',
  'material_quantity': 0.06,
  'material_quantity_unit': 'm/m2',
  'material_type': 'brick'},
 {'building_functional_unit': 'm2',
  'building_quantity': 5.74,
  'building_type': 'Unitised',
  'element_id': '045d27e6-8397-4672-9f4a-6cbc5fe4e716',
  'element_quantity': 6.27,
  'element_quantity_unit': 'm2',
  'element_type': 'cladding',
  'group_basetype': 'facade',
  'material_impact_data_ID': '5043ffe6-9d2e-448e-83ed-f36f1f5decfc',
  'material_quantity': 0.025,
  'material_quantity_unit': 'm/m2',
  'material_type': 'GRC'},
 {'building_functional_unit': 'm2',
  'building_quantity': 5.74,
  'building_type': 'Unitised',
  'element_id': '045d27e6-8397-4672-9f4a-6cbc5fe4e716',
  'element_quantity': 6.27,
  'element_quantity_unit': 'm2',
  'element_type': 'cladding',
  'group_basetype': 'facade',
  'material_impact_data_ID': '53bbd2be-f9ac-4ee7-88f3-34df68ee5187',
  'material_quantity': 0.013,
  'material_quantity_unit': 'm/m2',
  'material_type': 'Fiber cement'}
]

DataFrame

print(material_df)

  group_basetype building_type building_functional_unit  ...               material_impact_data_ID material_quantity_unit material_quantity
0         facade      Unitised                       m2  ...  5726d14e-d36e-417d-afc4-c70793080186                  m2/m2             1.000
1         facade      Unitised                       m2  ...  529d8876-6adb-449c-a12a-74c56aaadc4f                   m/m2             0.040
2         facade      Unitised                       m2  ...  e28d29a9-38f8-4684-a6b1-0615ac7f66e5                   m/m2             0.060
3         facade      Unitised                       m2  ...  5043ffe6-9d2e-448e-83ed-f36f1f5decfc                   m/m2             0.025
4         facade      Unitised                       m2  ...  53bbd2be-f9ac-4ee7-88f3-34df68ee5187                   m/m2             0.013

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM