Excel to nested JSON including child elements into array
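I am trying to convert an Excel file to nested JSON with Python, with repeated values collected into arrays of elements.
For example, given a CSV with this structure: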
Manufacturer,oilType,viscosity
shell,superOil,1ova
shell,superOil,2ova
shell,normalOil,1ova
bp, power, 10bba
the JSON should look like this (expected output):
elements: [
    {
        "Manufacturer": "shell",
        "details": [
            {
                "OilType": "superOil",
                "Viscosity": [
                    "1ova",
                    "2ova"
                ]
            },
            {
                "OilType": "normalOil",
                "Viscosity": [
                    "1ova"
                ]
            }
        ]
    },
    {
        "Manufacturer": "bp",
        "details": [
            {
                "OilType": "power",
                "Viscosity": [
                    "10bba"
                ]
            }
        ]
    }
]
So far I have used openpyxl to convert the CSV to JSON, which gives the value of every header for each row, in a format like this (current output):
[{Manufacturer: "shell", oilType: "superOil", Viscosity:"1ova"},{...},{...},...]
Please help me get the expected output.
Hello, and welcome to StackOverflow.
Your question is not really about openpyxl, since you don't need to save to an Excel file.
You can think of it in terms of a pandas DataFrame.
In practice, this gives something like:
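import json

import pandas as pd

# skipinitialspace strips the blank after each comma (as in the "bp, power, 10bba" row)
df = pd.read_csv("oil.csv", skipinitialspace=True)  # or read_excel if this is an Excel file

# Collect the viscosities of each (Manufacturer, oilType) pair into a list
oils = df.groupby(["Manufacturer", "oilType"]).aggregate(pd.Series.to_list)

elements = [
    {
        "Manufacturer": manufacturer,
        "details": [  # key spelled as in the expected output
            {"OilType": o, "Viscosity": v}
            for o, v in data.droplevel(0).viscosity.items()
        ],
    }
    for manufacturer, data in oils.groupby(level="Manufacturer")
]

with open("oil.json", "w") as f:
    json.dump({"elements": elements}, f)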
For reference, oils looks like this:
                           viscosity
Manufacturer oilType
bp           power           [10bba]
shell        normalOil        [1ova]
             superOil   [1ova, 2ova]
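Note that the grouped result lists bp before shell, because groupby sorts its keys, whereas the expected output in the question starts with shell. For comparison, here is a minimal sketch of the same grouping using only the standard library, which keeps manufacturers and oil types in the order they first appear. The helper name nest_rows and the in-memory SAMPLE_CSV are hypothetical, introduced only for illustration; the column names are assumed to match the CSV header from the question.

```python
import csv
import io
import json

# Sample rows copied from the question; a real script would open the CSV file instead.
SAMPLE_CSV = """Manufacturer,oilType,viscosity
shell,superOil,1ova
shell,superOil,2ova
shell,normalOil,1ova
bp, power, 10bba
"""


def nest_rows(rows):
    """Group flat row dicts into the nested structure (hypothetical helper)."""
    # manufacturer -> {oil type -> [viscosities]}; dicts preserve insertion order
    grouped = {}
    for row in rows:
        oil_types = grouped.setdefault(row["Manufacturer"], {})
        oil_types.setdefault(row["oilType"], []).append(row["viscosity"])
    return [
        {
            "Manufacturer": manufacturer,
            "details": [
                {"OilType": oil_type, "Viscosity": viscosities}
                for oil_type, viscosities in oil_types.items()
            ],
        }
        for manufacturer, oil_types in grouped.items()
    ]


# skipinitialspace drops the blank after each comma in the "bp, power, 10bba" row
reader = csv.DictReader(io.StringIO(SAMPLE_CSV), skipinitialspace=True)
elements = nest_rows(reader)
print(json.dumps({"elements": elements}, indent=2))
```

Because nest_rows only needs an iterable of dicts, it should also accept a list of per-row dicts like the current output shown in the question, provided the keys are spelled the same way as the CSV header.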