How to convert JSON tree data into a DataFrame in Python?
I have JSON data that can be represented as a tree structure. Each node has four attributes: name, id, child, and parentid (pid). A leaf node has only three: id, pid, and name.
{'child': [{'id': '', 'child': [{'id': '', 'child': [{'name': '', 'id': '', 'pid': ''}], 'name': '', 'pid': ''}], 'name': '', 'pid': ''}], 'name': '', 'pid': '', 'id': ''}
I want to convert it to a dataframe with three columns, like:
id, pid, name
1 .., ..., ....
2 .., ..., ....
It should contain the data from every layer for the three attributes (id, pid, name).
I have tried pandas.read_json with the default parameters, but it does not seem to iterate through all the layers; the output looks like:
id, pid, name, child
1 .., ..., ...., {'id':'','pid': '','name': '', 'child':[{...}]}
2 .., ..., ...., {'id':'','pid': '','name': '', 'child':[{...}]}
I am wondering whether there is an easy way to solve this problem, with or without pandas.
I used recursion to do it, and I have verified that it works on my data.
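import json
import pandas as pd

def test_iterate(df):
    # Collect id/pid/name for this level, then recurse into each row's children.
    global total_data
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
    total_data = pd.concat([total_data, df[['id', 'pid', 'name']]])
    try:
        df['child'].apply(lambda x: test_iterate(pd.DataFrame(x)))
    except Exception as inst:
        # Any error (e.g. no 'child' column at the leaves) is printed and
        # ends the recursion at this level.
        print(inst)

if __name__ == '__main__':
    total_data = pd.DataFrame()
    with open('test.json') as f:
        loaddata = json.load(f)
    df = pd.DataFrame(loaddata)
    test_iterate(df)
    total_data.to_csv('test.csv', index=False)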
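For comparison, here is a minimal sketch of the same recursion without pandas or a global variable. It assumes the layout described in the question: each node is a dict with id, pid, and name, plus an optional child list. The function name flatten_tree and the sample values are hypothetical and only there for illustration.

```python
def flatten_tree(node, fields=("id", "pid", "name")):
    """Yield one row (a dict of the chosen fields) per node, parent before children."""
    # Missing attributes become None instead of raising a KeyError.
    yield {key: node.get(key) for key in fields}
    # Leaf nodes have no "child" key (or an empty/None value), so fall back to [].
    for child in node.get("child") or []:
        yield from flatten_tree(child, fields)


# Hypothetical sample data shaped like the tree in the question.
tree = {
    "id": "1", "pid": "0", "name": "root",
    "child": [
        {"id": "2", "pid": "1", "name": "a",
         "child": [{"id": "4", "pid": "2", "name": "a1"}]},
        {"id": "3", "pid": "1", "name": "b"},
    ],
}

rows = list(flatten_tree(tree))
for row in rows:
    print(row["id"], row["pid"], row["name"])
```

If pandas is available, pd.DataFrame(rows, columns=['id', 'pid', 'name']) should turn the list into the three-column frame; csv.DictWriter from the standard library would cover the no-pandas case. Very deep trees could hit Python's recursion limit, in which case an explicit stack may be the safer choice.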