
Convert CSV to well-structured JSON in Python

I have a CSV file structured as follows:

Store, Region, District, MallName, Location

1234,90,910,MallA,GMT
4567,87,902,MallB,EST
2468,90,811,MallC,PST
1357,87,902,MallD,CST

What I have managed to produce, after a lot of trial and error, is the following format:

{
  "90": {
    "910": {
      "1234": {
        "name": "MallA",
        "location": "GMT"
      }
    },
    "811": {
      "2468": {
        "name": "MallC",
        "location": "PST"
      }
    }
  },
  "87": {
    "902": {
      "4567": {
        "name": "MallB",
        "location": "EST"
      },
      "1357": {
        "name": "MallD",
        "location": "CST"
      }
    }
  }
}

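The code below is stripped down to match the sample data set I provided, but you get the idea of what is going on. Again, it is very iterative and un-Pythonic, which is something I am trying to move away from. (If anyone feels the helper procedures I defined are worth posting, I can add them.)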

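import json

#************
#   Main()
#************
# getFilePath(), addStoreDetails(), addStore() and addDistrict() are helper
# procedures defined elsewhere (not shown here).
dictHierarchy = {}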

with open(getFilePath(), 'r') as f:
    content = [line.strip('\n') for line in f.readlines()]

for data in content:
    data = data.split(",")

    myRegion = data[1]
    myDistrict = data[2]
    myName = data[3]
    myLocation = data[4]
    myStore = data[0]

    if myRegion in dictHierarchy:
        #check for District
        if myDistrict in dictHierarchy[myRegion]:
            #checkforStore
            dictHierarchy[myRegion][myDistrict].update({myStore:addStoreDetails(data)})
        else:
            #add district
            dictHierarchy[myRegion].update({myDistrict:addStore(data)}) 
    else:
        #add region
        dictHierarchy.update({myRegion:addDistrict(data)})

with open('hierarchy.json', 'w') as outfile:
    json.dump(dictHierarchy, outfile)
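
For comparison, the three-way branching above can probably be collapsed with `dict.setdefault`, which returns the existing child dictionary or inserts an empty one. This is only a minimal sketch: it assumes the rows are parsed with `csv.reader`, that the header line is skipped, and that each store record holds just `name` and `location`. `build_hierarchy` is a hypothetical name, and it stands in for the helper procedures that are not shown rather than reproducing them.

```python
import csv
import io
import json


def build_hierarchy(csvfile):
    """Nest rows as region -> district -> store -> details (hypothetical helper)."""
    hierarchy = {}
    reader = csv.reader(csvfile, skipinitialspace=True)
    next(reader, None)  # skip the header line
    for row in reader:
        if not row:
            continue  # ignore blank lines
        store, region, district, name, location = row
        # setdefault returns the existing child dict, or inserts an empty one
        stores = hierarchy.setdefault(region, {}).setdefault(district, {})
        stores[store] = {"name": name, "location": location}
    return hierarchy


# Sample data from the question, inlined so the sketch runs on its own
sample = io.StringIO(
    "Store, Region, District, MallName, Location\n"
    "\n"
    "1234,90,910,MallA,GMT\n"
    "4567,87,902,MallB,EST\n"
    "2468,90,811,MallC,PST\n"
    "1357,87,902,MallD,CST\n"
)
hierarchy = build_hierarchy(sample)
print(json.dumps(hierarchy, indent=2))
```

Passing `indent=2` to `json.dumps` (or `json.dump`) is what makes the written file readable as nested plain text.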

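The obsessive-compulsive in me looks at the JSON output above and thinks that, to someone opening the file blind, it looks like a junk drawer. For plain-text readability, I was hoping to group the data and put it into JSON like this: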

{"Regions":[
    {"Region":"90", "Districts":[
        {"District":"910", "Stores":[
            {"Store":"1234", "name":"MallA", "location":"GMT"}]},
        {"District":"811", "Stores":[
            {"Store":"2468", "name":"MallC", "location":"PST"}]}]},
    {"Region":"87", "Districts":[
        {"District":"902", "Stores":[
            {"Store":"4567", "name":"MallB", "location":"EST"},
            {"Store":"1357", "name":"MallD", "location":"CST"}]}]}]}

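Long story short, I wasted a lot of time today trying to figure out how to actually populate this data structure in Python, with essentially nothing to show for it. Is there a clean, Pythonic way to achieve this? Is it even worth the effort?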

I have added a header row to your input, like this:

Store,Region,District,name,location
1234,90,910,MallA,GMT
4567,87,902,MallB,EST
2468,90,811,MallC,PST
1357,87,902,MallD,CST

Then use Python's csv reader together with itertools.groupby, like this:

import csv
from itertools import groupby
from operator import itemgetter

with open('in.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)

    regions = []

    # groupby only merges adjacent rows, so sort by the grouping key first
    regions_dict = sorted(reader, key=itemgetter('Region'))
    for region_id, region_group in groupby(regions_dict, itemgetter('Region')):

        districts = []
        regions.append({'Region': region_id, 'Districts': districts})

        districts_dict = sorted(region_group, key=itemgetter('District'))
        for district_id, district_group in groupby(districts_dict, itemgetter('District')):
            # each store row still carries its Region and District keys
            districts.append({'District': district_id, 'Stores': list(district_group)})

print(regions)
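
The grouped list still carries the `Region` and `District` keys inside every store record, and it is printed as a Python object rather than written as JSON. One possible follow-up is sketched here, assuming the target is the `{"Regions": [...]}` layout from the question. `group_rows` is a hypothetical helper name, and the sample rows are inlined in place of `in.csv` so the sketch runs on its own.

```python
import csv
import io
import json
from itertools import groupby
from operator import itemgetter


def group_rows(rows):
    """Group flat CSV rows into Regions -> Districts -> Stores (hypothetical helper)."""
    regions = []
    by_region = sorted(rows, key=itemgetter("Region"))
    for region_id, region_group in groupby(by_region, itemgetter("Region")):
        districts = []
        by_district = sorted(region_group, key=itemgetter("District"))
        for district_id, district_group in groupby(by_district, itemgetter("District")):
            # Keep only the store-level fields in each store record
            stores = [
                {"Store": r["Store"], "name": r["name"], "location": r["location"]}
                for r in district_group
            ]
            districts.append({"District": district_id, "Stores": stores})
        regions.append({"Region": region_id, "Districts": districts})
    return {"Regions": regions}


# Sample data with the header row added, inlined in place of in.csv
sample = io.StringIO(
    "Store,Region,District,name,location\n"
    "1234,90,910,MallA,GMT\n"
    "4567,87,902,MallB,EST\n"
    "2468,90,811,MallC,PST\n"
    "1357,87,902,MallD,CST\n"
)
result = group_rows(csv.DictReader(sample))
print(json.dumps(result, indent=4))
```

Because the keys are sorted as strings, region 87 comes out before region 90, which differs from the order shown in the question. If the original file order matters, a dictionary-based grouping would be the alternative to sorting.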

