How to convert JSON tree data to a DataFrame in Python?

Question

I have JSON data that can be represented as a tree, where each node has four attributes: name, id, child, and parent id (pid). Leaf nodes have only three: id, pid, and name.

{'child': [{'id': '', 'child': [{'id': '', 'child': [{'name': '', 'id': '', 'pid': ''}], 'name': '', 'pid': ''}], 'name': '', 'pid': ''}], 'name': '', 'pid': '', 'id': ''}

I want to convert it to a DataFrame with three columns, like this:

    id, pid, name
1   .., ..., ....
2   .., ..., ....

It should contain the three attributes (id, pid, name) of the nodes from every layer.

I have tried pandas.read_json with the default parameters, but it does not seem to descend through all the layers, and the output looks like this:

    id, pid, name, child
1   .., ..., ...., {'id':'','pid': '','name': '', 'child':[{...}]}
2   .., ..., ...., {'id':'','pid': '','name': '', 'child':[{...}]}

I am wondering whether there is an easy way to solve this problem, with or without pandas.

Answer 1

I used recursion to do this, and I have verified that it works on my data.

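import json
import pandas as pd


def test_iterate(df):
    # Collect id/pid/name from every node on this level, then recurse into the children.
    global total_data
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
    total_data = pd.concat([total_data, df[['id', 'pid', 'name']]], ignore_index=True)
    if 'child' not in df.columns:
        return  # every node on this level is a leaf
    for children in df['child']:
        # Leaf nodes mixed in with parent nodes show up as NaN rather than a list.
        if isinstance(children, list) and children:
            test_iterate(pd.DataFrame(children))


if __name__ == '__main__':
    total_data = pd.DataFrame()
    with open('test.json') as f:
        loaddata = json.load(f)
    # Wrap a single root object in a list so that each DataFrame row is one node.
    df = pd.DataFrame(loaddata if isinstance(loaddata, list) else [loaddata])
    test_iterate(df)
    total_data.to_csv('test.csv', index=False)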
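
The flattening can also be done on the plain dicts first, building the DataFrame only once at the end. The following is a minimal sketch, assuming each node is a dict with id, pid, name and an optional child list as described in the question. The function name flatten_tree and the sample tree are made up for illustration and are not part of the original post.

```python
import pandas as pd


def flatten_tree(root):
    """Return one (id, pid, name) row per node, parents before children."""
    rows = []
    stack = [root]  # explicit stack instead of recursion
    while stack:
        node = stack.pop()
        rows.append({"id": node["id"], "pid": node["pid"], "name": node["name"]})
        # Leaf nodes have no 'child' key; reversed() keeps sibling order.
        stack.extend(reversed(node.get("child", [])))
    return pd.DataFrame(rows, columns=["id", "pid", "name"])


# Hypothetical sample tree, not the asker's data.
sample = {
    "id": "1", "pid": "", "name": "root",
    "child": [
        {"id": "2", "pid": "1", "name": "a",
         "child": [{"id": "3", "pid": "2", "name": "a1"}]},
        {"id": "4", "pid": "1", "name": "b"},
    ],
}

result = flatten_tree(sample)
print(result)
```

Collecting rows in a list and calling the DataFrame constructor once avoids repeated concatenation, which copies the accumulated data on every call.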

Answer 1
0 Accepted 2017-05-23 13:13:05