Flatten JSON with Pandas (multiple lists)
I want to load a JSON file into a pandas DataFrame in which every row lists all of the data. The JSON file looks like this:
{
  "building_element_group": [
    {
      "basetype": "facade",
      "building_element": [
        {
          "type": "Unitised",
          "functional_unit": "m2",
          "quantity": 5.74,
          "element": [
            {
              "id": "13d22d3b-7fc6-4116-93ad-80c139e006dc",
              "type": "glazing",
              "quantity_unit": "m2",
              "quantity": 3.29,
              "material": [
                {
                  "type": "glass",
                  "impact_data_ID": "5726d14e-d36e-417d-afc4-c70793080186",
                  "quantity_unit": "m2/m2",
                  "quantity": 1
                }
              ]
            },
            {
              "id": "045d27e6-8397-4672-9f4a-6cbc5fe4e716",
              "type": "cladding",
              "quantity_unit": "m2",
              "quantity": 6.27,
              "material": [
                {
                  "type": "terracotta",
                  "impact_data_ID": "529d8876-6adb-449c-a12a-74c56aaadc4f",
                  "quantity_unit": "m/m2",
                  "quantity": 0.04
                },
                {
                  "type": "brick",
                  "impact_data_ID": "e28d29a9-38f8-4684-a6b1-0615ac7f66e5",
                  "quantity_unit": "m/m2",
                  "quantity": 0.06
                },
                {
                  "type": "GRC",
                  "impact_data_ID": "5043ffe6-9d2e-448e-83ed-f36f1f5decfc",
                  "quantity_unit": "m/m2",
                  "quantity": 0.025
                },
                {
                  "type": "Fiber cement",
                  "impact_data_ID": "53bbd2be-f9ac-4ee7-88f3-34df68ee5187",
                  "quantity_unit": "m/m2",
                  "quantity": 0.013
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
I then loaded the file above and ran the following:
test = pd.json_normalize(df['building_element_group'],
                         record_path=['building_element', 'element', 'material'],
                         meta=['basetype',
                               ['building_element', 'quantity'],
                               ['building_element', 'type'],
                               ['building_element', 'element', 'quantity_unit'],
                               ['building_element', 'element', 'type']],
                         errors='ignore', sep='-')
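What I want is for every row to show all of the JSON data, including everything that is nested. I have used meta to do this, but I have to type in every branch I need by hand. Is there a way to get the same result without listing them manually?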
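One way to avoid typing the meta branches by hand is to derive them from the data itself. The sketch below walks the first record along record_path and collects every scalar key it finds as a meta path. Note that build_meta is a hypothetical helper, not part of pandas; it assumes the first record at each level is representative and that every list along the path is non-empty. The sample data is a trimmed version of the file above, and record_prefix is used only to keep the material columns from clashing with meta names.

```python
import pandas as pd


def build_meta(first_record, record_path):
    """Hypothetical helper: collect scalar keys along record_path as meta paths."""
    meta = []
    node = first_record
    for depth, list_key in enumerate(record_path):
        prefix = list(record_path[:depth])
        for key, value in node.items():
            # Only scalar fields become meta columns; nested containers are skipped.
            if not isinstance(value, (list, dict)):
                meta.append(prefix + [key])
        # Descend into the first child (assumption: the list is non-empty).
        node = node[list_key][0]
    return meta


# Trimmed sample with the same nesting as the file in the question.
groups = [
    {
        "basetype": "facade",
        "building_element": [
            {
                "type": "Unitised",
                "quantity": 5.74,
                "element": [
                    {"type": "glazing",
                     "material": [{"type": "glass", "quantity": 1}]},
                    {"type": "cladding",
                     "material": [{"type": "terracotta", "quantity": 0.04},
                                  {"type": "brick", "quantity": 0.06}]},
                ],
            }
        ],
    }
]

record_path = ["building_element", "element", "material"]
meta = build_meta(groups[0], record_path)

flat = pd.json_normalize(
    groups,
    record_path=record_path,
    meta=meta,
    record_prefix="material-",  # distinguishes record columns from meta columns
    errors="ignore",
    sep="-",
)
```

Keys that appear only in later records would be missed by this helper, so it is best suited to files where every record at a given level has the same fields.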
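Consider wrangling the data with list/dictionary comprehensions that merge the dictionaries at each level, then passing the result to the DataFrame constructor: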
import json
import pandas as pd

with open("BuildingElementMaterial.json") as f:
    data = json.load(f)

# One flat dict per material; scalar keys from each parent level are merged in
# with a level-specific prefix, and the nested list key is skipped at each level.
pd_data = [
    {
        **{f"group_{k}": v for k, v in g.items() if k != "building_element"},
        **{f"building_{k}": v for k, v in b.items() if k != "element"},
        **{f"element_{k}": v for k, v in e.items() if k != "material"},
        **{f"material_{k}": v for k, v in m.items()}
    }
    for g in data["building_element_group"]
    for b in g["building_element"]
    for e in b["element"]
    for m in e["material"]
]

material_df = pd.DataFrame(pd_data)
Output
Dictionary
print(pd_data)
[
{'building_functional_unit': 'm2',
'building_quantity': 5.74,
'building_type': 'Unitised',
'element_id': '13d22d3b-7fc6-4116-93ad-80c139e006dc',
'element_quantity': 3.29,
'element_quantity_unit': 'm2',
'element_type': 'glazing',
'group_basetype': 'facade',
'material_impact_data_ID': '5726d14e-d36e-417d-afc4-c70793080186',
'material_quantity': 1,
'material_quantity_unit': 'm2/m2',
'material_type': 'glass'},
{'building_functional_unit': 'm2',
'building_quantity': 5.74,
'building_type': 'Unitised',
'element_id': '045d27e6-8397-4672-9f4a-6cbc5fe4e716',
'element_quantity': 6.27,
'element_quantity_unit': 'm2',
'element_type': 'cladding',
'group_basetype': 'facade',
'material_impact_data_ID': '529d8876-6adb-449c-a12a-74c56aaadc4f',
'material_quantity': 0.04,
'material_quantity_unit': 'm/m2',
'material_type': 'terracotta'},
{'building_functional_unit': 'm2',
'building_quantity': 5.74,
'building_type': 'Unitised',
'element_id': '045d27e6-8397-4672-9f4a-6cbc5fe4e716',
'element_quantity': 6.27,
'element_quantity_unit': 'm2',
'element_type': 'cladding',
'group_basetype': 'facade',
'material_impact_data_ID': 'e28d29a9-38f8-4684-a6b1-0615ac7f66e5',
'material_quantity': 0.06,
'material_quantity_unit': 'm/m2',
'material_type': 'brick'},
{'building_functional_unit': 'm2',
'building_quantity': 5.74,
'building_type': 'Unitised',
'element_id': '045d27e6-8397-4672-9f4a-6cbc5fe4e716',
'element_quantity': 6.27,
'element_quantity_unit': 'm2',
'element_type': 'cladding',
'group_basetype': 'facade',
'material_impact_data_ID': '5043ffe6-9d2e-448e-83ed-f36f1f5decfc',
'material_quantity': 0.025,
'material_quantity_unit': 'm/m2',
'material_type': 'GRC'},
{'building_functional_unit': 'm2',
'building_quantity': 5.74,
'building_type': 'Unitised',
'element_id': '045d27e6-8397-4672-9f4a-6cbc5fe4e716',
'element_quantity': 6.27,
'element_quantity_unit': 'm2',
'element_type': 'cladding',
'group_basetype': 'facade',
'material_impact_data_ID': '53bbd2be-f9ac-4ee7-88f3-34df68ee5187',
'material_quantity': 0.013,
'material_quantity_unit': 'm/m2',
'material_type': 'Fiber cement'}
]
DataFrame
print(material_df)
group_basetype building_type building_functional_unit ... material_impact_data_ID material_quantity_unit material_quantity
0 facade Unitised m2 ... 5726d14e-d36e-417d-afc4-c70793080186 m2/m2 1.000
1 facade Unitised m2 ... 529d8876-6adb-449c-a12a-74c56aaadc4f m/m2 0.040
2 facade Unitised m2 ... e28d29a9-38f8-4684-a6b1-0615ac7f66e5 m/m2 0.060
3 facade Unitised m2 ... 5043ffe6-9d2e-448e-83ed-f36f1f5decfc m/m2 0.025
4 facade Unitised m2 ... 53bbd2be-f9ac-4ee7-88f3-34df68ee5187 m/m2 0.013
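If the nesting depth or the key names are not known in advance, the same merge-at-each-level idea can be written recursively. The following is a minimal sketch rather than a definitive implementation: flatten_records is a hypothetical helper, it treats any non-empty list of dicts as a nested level, names columns by their full dotted path, and combines sibling lists at the same level as a cross product. The sample is a trimmed version of the file above.

```python
import pandas as pd


def flatten_records(node, prefix=""):
    """Hypothetical helper: return one flat dict per leaf combination under node."""
    scalars = {}
    child_lists = []
    for key, value in node.items():
        name = f"{prefix}{key}"
        if isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            child_lists.append((name, value))
        else:
            # Scalars (and empty or non-dict lists) are kept as plain columns.
            scalars[name] = value

    rows = [scalars]
    for name, children in child_lists:
        expanded = []
        for row in rows:
            for child in children:
                # Merge this level's fields into every row produced below it.
                for sub in flatten_records(child, prefix=f"{name}."):
                    expanded.append({**row, **sub})
        rows = expanded
    return rows


sample = {
    "building_element_group": [
        {
            "basetype": "facade",
            "building_element": [
                {
                    "type": "Unitised",
                    "element": [
                        {"type": "glazing",
                         "material": [{"type": "glass", "quantity": 1}]},
                        {"type": "cladding",
                         "material": [{"type": "terracotta", "quantity": 0.04},
                                      {"type": "brick", "quantity": 0.06}]},
                    ],
                }
            ],
        }
    ]
}

rows = flatten_records(sample)
flat_df = pd.DataFrame(rows)
```

Full dotted paths make the column names long, but they avoid collisions between keys such as type and quantity that repeat at several levels.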