Create nested JSON from flat CSV

I am trying to create a four-level nested JSON from a CSV, based on this example:

Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123

At each level I would like to sum the costs and include the total in the output JSON at the relevant level.

The structure of the output JSON should look something like this:

    {
          "id": "aUniqueIdentifier", 
          "name": "usually a nodes name", 
          "data": [
                {
                      "key": "some key", 
                      "value": "some value"
                }, 
                {
                      "key": "some other key", 
                      "value": "some other value"
                }
          ], 
          "children": [/* other nodes or empty */ ]
    }

(REF: http://blog.thejit.org/2008/04/27/feeding-json-tree-structures-to-the-jit/)

I have been thinking along the lines of a recursive function in Python but have not had much success with this approach so far. Any suggestions for a quick and easy solution would be greatly appreciated.

UPDATE: I am gradually giving up on the idea of the summarised costs because I just can't figure it out :( (I'm not much of a Python coder yet). Simply being able to generate the formatted JSON would be good enough, and I can plug in the numbers later if I have to.

I have been reading, googling and reading some more for a solution, and have learnt a lot on the way, but I have still had no success in creating my nested JSON files from the above CSV structure. There must be a simple solution somewhere on the web? Maybe somebody else has had more luck with their search terms?

Here are some hints.

Parse the input to a list of lists with csv.reader:

>>> rows = list(csv.reader(source.splitlines()))
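For the sample data in the question, that call might be set up roughly as follows. This is only a sketch: it assumes `source` is the CSV text held in a single string, and the names `header` and `records` are introduced here for illustration.

```python
import csv

# Assumption: `source` holds the CSV text from the question as one string.
source = """\
Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
"""

rows = list(csv.reader(source.splitlines()))

# The first row is the column names; the rest are data rows.
# csv.reader returns every field as a string, including the cost.
header, records = rows[0], rows[1:]
print(header)
print(records[0])
```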

Loop over the list to build up your dictionary and summarize the costs. Depending on the structure you're looking to create, the build-up might look something like this:

>>> summary = {}
>>> for region, company, department, expense, cost in rows[1:]:
...     # JSON object keys must be strings, so join the three levels into one key
...     key = '/'.join((region, company, department))
...     summary.setdefault(key, []).append((expense, cost))
...

Write the result out with json.dump:

>>> with open('dest.json', 'w') as f:
...     json.dump(summary, f)
...
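The "summarize the costs" step is not spelled out in these hints. One possible way to get a total at every level is to add each row's cost to every prefix of its path. The sketch below assumes the cost column always parses as an integer; the `totals` mapping and its tuple keys are illustrative choices, not part of the original hints.

```python
import csv
from collections import defaultdict

source = """\
Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
"""
rows = list(csv.reader(source.splitlines()))

# totals maps a path prefix, e.g. ('Gondwanaland', 'Bobs Bits'),
# to the sum of all costs found underneath that prefix.
totals = defaultdict(int)
for *path, cost in rows[1:]:          # path = [region, company, department, expense]
    for depth in range(1, len(path) + 1):
        totals[tuple(path[:depth])] += int(cost)

for prefix, total in totals.items():
    print(' / '.join(prefix), total)
```

Tuple keys are convenient for lookups in Python but cannot be written by json.dump directly, so they would need to be converted to strings (or folded into a nested structure) before output.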

Hopefully, the recursive function below will help get you started. It builds a tree from the input. Be aware of what type you want your leaves to be; here they are labelled as the "cost". You'll need to elaborate on the function to build up the exact structure you intend:

import csv, itertools, json

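def cluster(rows):
    # Note: itertools.groupby only merges adjacent rows, so the rows must
    # already be sorted (or at least grouped) by their leading columns.
    result = []
    for key, group in itertools.groupby(rows, key=lambda r: r[0]):
        group_rows = [row[1:] for row in group]  # drop the column just grouped on
        if len(group_rows[0]) == 2:
            # Only (expense, cost) pairs remain: store them as a dict leaf.
            result.append({key: dict(group_rows)})
        else:
            result.append({key: cluster(group_rows)})
    return result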

if __name__ == '__main__':
    s = '''\
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
'''
    rows = list(csv.reader(s.splitlines()))
    r = cluster(rows)
    print(json.dumps(r, indent=4))
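As one possible elaboration toward the id/name/data/children layout from the question, the sketch below applies the same groupby recursion and sums the costs on the way back up. Several things here are assumptions rather than anything the question specifies: the function name `build_nodes`, the use of the slash-joined path as the "id", the single "cost" entry in "data", and costs that always parse as integers.

```python
import csv
import itertools
import json

def build_nodes(rows, parent_id=""):
    """Group rows by their first column and return a list of tree nodes."""
    nodes = []
    for name, group in itertools.groupby(rows, key=lambda r: r[0]):
        # Hypothetical id scheme: the path from the top level, joined by "/".
        node_id = f"{parent_id}/{name}" if parent_id else name
        rest = [row[1:] for row in group]
        if len(rest[0]) == 1:
            # Only the cost column is left, so this node is a leaf (an expense).
            children = []
            total = sum(int(r[0]) for r in rest)
        else:
            children = build_nodes(rest, node_id)
            # A node's cost is the sum of its children's costs.
            total = sum(child["data"][0]["value"] for child in children)
        nodes.append({
            "id": node_id,
            "name": name,
            "data": [{"key": "cost", "value": total}],
            "children": children,
        })
    return nodes

s = """\
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
"""
records = sorted(csv.reader(s.splitlines()))  # groupby needs adjacent keys
tree = build_nodes(records)
print(json.dumps(tree, indent=4))
```

This returns a list with one node per region. If a single root object is required, the list could be wrapped in one more node whose "children" is that list.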
