How to convert nested JSON data to CSV using Python?
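I have a file containing an array of more than 5000 objects. However, I am having trouble converting one particular part of my JSON file into the appropriate columns in CSV format.
Below is a sample version of my data file: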
{
    "Result": {
        "Example 1": {
            "Type1": [
                {
                    "Owner": "Name1 Example",
                    "Description": "Description1 Example",
                    "Email": "example1_email@email.com",
                    "Phone": "(123) 456-7890"
                }
            ]
        },
        "Example 2": {
            "Type1": [
                {
                    "Owner": "Name2 Example",
                    "Description": "Description2 Example",
                    "Email": "example2_email@email.com",
                    "Phone": "(111) 222-3333"
                }
            ]
        }
    }
}
Here is my current code:
import csv
import json

json_file='example.json'
with open(json_file, 'r') as json_data:
    x = json.load(json_data)

f = csv.writer(open("example.csv", "w"))
f.writerow(["Address","Type","Owner","Description","Email","Phone"])

for key in x["Result"]:
    type = "Type1"
    f.writerow([key,
                type,
                x["Result"][key]["Type1"]["Owner"],
                x["Result"][key]["Type1"]["Description"],
                x["Result"][key]["Type1"]["Email"],
                x["Result"][key]["Type1"]["Phone"]])
My problem is that I run into this error:
Traceback (most recent call last):
File "./convert.py", line 18, in <module>
x["Result"][key]["Type1"]["Owner"],
TypeError: list indices must be integers or slices, not str
When I try replacing the last index (such as "Owner") with an integer value, I get this error: IndexError: list index out of range.
When I change only the f.writerow call to
f.writerow([key,
            type,
            x["Result"][key]["Type1"]])
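I do get the results, but everything is merged into a single column, which makes sense. Output image: https://imgur.com/a/JpDkaAT
I would like the results split into separate columns by label rather than merged into one. Can anyone help?
Thanks!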
Type1 in your data structure is a list, not a dict. So you need to iterate over it instead of referencing it by key.
for key in x["Result"]:
    # key is now "Example 1" etc.
    type1 = x["Result"][key]["Type1"]
    # type1 is a list, not a dict
    for item in type1:
        # item is one dict from the list, so string keys work here
        f.writerow([key,
                    "Type1",
                    item["Owner"],
                    item["Description"],
                    item["Email"],
                    item["Phone"]])
The inner for loop means you do not depend on the assumption that "Type1" has only one item in its list.
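For anyone who wants to try the nested loop without touching files, here is a minimal self-contained sketch. It assumes the data has exactly the shape shown in the question; the names SAMPLE, FIELDS and write_rows are made up for this illustration, and the output goes to an in-memory buffer instead of example.csv.

```python
import csv
import io
import json

# Sample data copied from the question, embedded as a string for the demo.
SAMPLE = """
{"Result": {
  "Example 1": {"Type1": [{"Owner": "Name1 Example",
                           "Description": "Description1 Example",
                           "Email": "example1_email@email.com",
                           "Phone": "(123) 456-7890"}]},
  "Example 2": {"Type1": [{"Owner": "Name2 Example",
                           "Description": "Description2 Example",
                           "Email": "example2_email@email.com",
                           "Phone": "(111) 222-3333"}]}
}}
"""

# Hypothetical helper constant: the per-entry fields, in column order.
FIELDS = ["Owner", "Description", "Email", "Phone"]


def write_rows(data, out):
    """Write one CSV row per dict found in each "Type1" list."""
    writer = csv.writer(out)
    writer.writerow(["Address", "Type"] + FIELDS)
    for key, types in data["Result"].items():
        for entry in types["Type1"]:
            # .get() leaves the cell empty if an entry lacks a field
            writer.writerow([key, "Type1"] + [entry.get(name, "") for name in FIELDS])


buffer = io.StringIO()
write_rows(json.loads(SAMPLE), buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with open("example.csv", "w", newline="") instead of the buffer.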
This is definitely not the best example, but I found it hard to optimize.
import csv

def json_to_csv(obj, res):
    # Walk the nested structure and append keys of containers
    # and values of leaves to the flat list res.
    for k, v in obj.items():
        if isinstance(v, dict):
            res.append(k)
            json_to_csv(v, res)
        elif isinstance(v, list):
            res.append(k)
            for el in v:
                json_to_csv(el, res)
        else:
            res.append(v)

obj = {
    "Result": {
        "Example 1": {
            "Type1": [
                {
                    "Owner": "Name1 Example",
                    "Description": "Description1 Example",
                    "Email": "example1_email@email.com",
                    "Phone": "(123) 456-7890"
                }
            ]
        },
        "Example 2": {
            "Type1": [
                {
                    "Owner": "Name2 Example",
                    "Description": "Description2 Example",
                    "Email": "example2_email@email.com",
                    "Phone": "(111) 222-3333"
                }
            ]
        }
    }
}

# newline="" stops the csv module from writing blank lines on Windows
with open("out.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Address","Type","Owner","Description","Email","Phone"])
    for k, v in obj["Result"].items():
        row = [k]
        json_to_csv(v, row)
        writer.writerow(row)
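One thing to watch with the recursive version: the position of each value in the row depends on the order of keys in the JSON, and a list with several entries ends up in one long row. As a possible alternative, here is a rough sketch that builds one dict per entry and lets csv.DictWriter place the values by column name. iter_records and COLUMNS are hypothetical names, and the sample below is a trimmed copy of the question's data with some fields left out on purpose to show how missing values are handled.

```python
import csv
import io

COLUMNS = ["Address", "Type", "Owner", "Description", "Email", "Phone"]


def iter_records(result):
    """Yield one dict per entry, for every type key under every address."""
    for address, types in result.items():
        for type_name, entries in types.items():
            for entry in entries:
                yield {"Address": address, "Type": type_name, **entry}


# Trimmed sample: some fields are deliberately missing.
result = {
    "Example 1": {"Type1": [{"Owner": "Name1 Example",
                             "Email": "example1_email@email.com"}]},
    "Example 2": {"Type1": [{"Owner": "Name2 Example",
                             "Phone": "(111) 222-3333"}]},
}

out = io.StringIO()
# restval fills in missing fields; extrasaction="ignore" drops unknown keys
writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="", extrasaction="ignore")
writer.writeheader()
writer.writerows(iter_records(result))
print(out.getvalue())
```

Because the columns are matched by name, this does not rely on "Type1" being the only type key or on every entry having all four fields.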
Figured it out!
I changed the f.writerow call to the following:
for key in x["Result"]:
    type = "Type1"
    f.writerow([key,
                type,
                x["Result"][key]["Type1"][0]["Owner"],
                x["Result"][key]["Type1"][0]["Email"]])
...
This lets me reference the keys inside the object. Hope this helps someone down the line!
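Note that indexing with [0] raises IndexError if a "Type1" list happens to be empty, and it silently skips any entries after the first. If that could occur in the real file, a guarded lookup along these lines might help. This is only a sketch: first_entry is a made-up helper, and the "Example 3" record with an empty list is hypothetical and not part of the original data.

```python
def first_entry(data, key, type_name="Type1"):
    """Return the first dict in data["Result"][key][type_name], or {} if missing or empty."""
    entries = data["Result"].get(key, {}).get(type_name) or []
    return entries[0] if entries else {}


data = {"Result": {
    "Example 1": {"Type1": [{"Owner": "Name1 Example",
                             "Email": "example1_email@email.com"}]},
    "Example 3": {"Type1": []},  # hypothetical empty list, not in the original data
}}

rows = []
for key in data["Result"]:
    entry = first_entry(data, key)
    # .get() with a default keeps the row shape even when a field is absent
    rows.append([key, "Type1", entry.get("Owner", ""), entry.get("Email", "")])

print(rows)
```

Each list in rows can then be passed to f.writerow as before.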