CSV to structured nested JSON using Python
I'm trying to convert a flat CSV structure into a nested JSON structure.
I have some data like:
State  SubRegion       Postcode  Suburb
ACT    South Canberra  2620      Oaks Estate
ACT    North Canberra  2601      Acton
ACT    North Canberra  2602      Ainslie
ACT    Gungahlin-Hall  2914      Amaroo
I want output like this:
[
    {
        "name": "ACT",
        "regions": [
            {
                "name": "South Canberra",
                "suburbs": [
                    {
                        "postcode": "2620",
                        "name": "Oaks Estate"
                    }
                ]
            },
            {
                "name": "North Canberra",
                "suburbs": [
                    {
                        "postcode": "2601",
                        "name": "Acton"
                    },
                    {
                        "postcode": "2602",
                        "name": "Ainslie"
                    }
                ]
            },
            {
                "name": "Gungahlin-Hall",
                "suburbs": [
                    {
                        "postcode": "2914",
                        "name": "Amaroo"
                    }
                ]
            }
        ]
    }
]
I tried to get this structure using pandas and a plain script, but haven't gotten the correct structure yet.
I think this should work:
import csv
import json

def add_new_region(name, postcode, name2):
    # Build a region dict that already holds its first suburb.
    d = {"name": name,
         "suburbs": [add_suburb(postcode, name2)]
         }
    return d

def add_suburb(postcode, name):
    return {"postcode": postcode,
            "name": name
            }

datalist = []
region_dict = {}  # state name -> index of that state in datalist
region_dict_counter = 0
with open("data.csv", "r", newline="") as f:
    data = csv.reader(f)
    next(data)  # skip headers
    for row in data:
        if row[0] in region_dict:
            for x in datalist[region_dict[row[0]]]["regions"]:
                if x["name"] == row[1]:
                    x["suburbs"].append(add_suburb(row[2], row[3]))
                    break
            else:
                # for/else: the loop found no matching region, so add a new one
                datalist[region_dict[row[0]]]["regions"].append(add_new_region(row[1], row[2], row[3]))
        else:
            # first time this state is seen
            d = {"name": row[0],
                 "regions": [add_new_region(row[1], row[2], row[3])]}
            datalist.append(d)
            region_dict[row[0]] = region_dict_counter
            region_dict_counter += 1

json_data = json.dumps(datalist, indent=4)
print(json_data)
with open("data.json", "w") as j:
    j.write(json_data)
I have solved this problem. Here is the solution:
import csv
import json

def getindex(convertedList, value):
    # Return the position of the item whose 'name' equals value, or -1 if absent.
    for index, item in enumerate(convertedList):
        if item['name'] == value:
            return index
    return -1

with open('Regions.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    mainData = []
    next(reader, None)  # skip the header row
    for row in reader:
        # Indices assume the four-column layout shown in the question:
        # row[0] state, row[1] subregion, row[2] postcode, row[3] suburb
        suburbObj = {
            'postcode': row[2],
            'name': row[3]
        }
        index = getindex(mainData, row[0])
        if index > -1:
            subindex = getindex(mainData[index]['regions'], row[1])
            if subindex > -1:
                # known state and region: just add the suburb
                mainData[index]['regions'][subindex]['suburbs'].append(suburbObj)
            else:
                # known state, new region
                regionObj = {
                    'name': row[1],
                    'suburbs': [suburbObj]
                }
                mainData[index]['regions'].append(regionObj)
        else:
            # new state
            stateObj = {
                'name': row[0],
                'regions': [{
                    'name': row[1],
                    'suburbs': [suburbObj]
                }]
            }
            mainData.append(stateObj)

print(json.dumps(mainData, indent=4))
If anyone has better-optimized code, please post your solution.
Thanks
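One possible alternative, offered as a sketch rather than a definitive answer: both scripts above scan a list to find an existing state or region, which gets slower as the data grows. Grouping into nested dicts first and converting to the list layout at the end avoids those scans. The helper name nest_rows and the inline SAMPLE string are made up for illustration, and a comma-delimited file with a header row is assumed.

```python
import csv
import io
import json


def nest_rows(rows):
    """Group (state, region, postcode, suburb) rows into the nested layout."""
    states = {}  # state name -> {region name -> list of suburb dicts}
    for state, region, postcode, suburb in rows:
        regions = states.setdefault(state, {})
        regions.setdefault(region, []).append(
            {"postcode": postcode, "name": suburb}
        )
    # Dicts keep insertion order (Python 3.7+), so first-seen order is preserved.
    return [
        {
            "name": state,
            "regions": [
                {"name": region, "suburbs": suburbs}
                for region, suburbs in regions.items()
            ],
        }
        for state, regions in states.items()
    ]


# Inline sample standing in for the CSV file; the comma delimiter is an assumption.
SAMPLE = """State,SubRegion,Postcode,Suburb
ACT,South Canberra,2620,Oaks Estate
ACT,North Canberra,2601,Acton
ACT,North Canberra,2602,Ainslie
ACT,Gungahlin-Hall,2914,Amaroo
"""

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row
nested = nest_rows(reader)
print(json.dumps(nested, indent=4))
```

To read from a real file, swap io.StringIO(SAMPLE) for open("data.csv", newline=""). Each row is handled with two dict lookups, so the work stays roughly linear in the number of rows.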