pandas json_normalize KeyError

I have a nested JSON file without a uniform structure, like the following sample:

[{ "name": "Jon", "last": "Jonny"}, 
 {"name": "Jimmy", "last": "johnson", "kids":[{"kidName":"johnson_junior","kidAge": "1"}, {"kidName":"johnson_junior2", "kidAge": "4"}]}]

Note that the second item has a list named "kids" that does not exist in the first item.

When I try to flatten the JSON with pandas json_normalize, it raises the error "KeyError: 'kids'".

This is the json_normalize command:

flat_json = json_normalize(json_file, record_path= 'kids',  errors='ignore')

It seems that json_normalize doesn't support nested JSON that lacks a uniform structure.

Has anyone experienced the same issue? Do you have an idea of how to get around it?

If it is not too much trouble, I would add 'kids':[{'kidName':None,'kidAge':None}] whenever that key is not present.

errors='ignore' applies to keys listed in meta (see the documentation), whereas what you are specifying with kids is a record path, and a missing record path key still raises KeyError.
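To make the difference concrete, here is a minimal sketch. It assumes pandas 1.0 or later (where json_normalize is available as pd.json_normalize), and the "city" key and the extra sample records are made up for illustration; they are not part of the original data.

```python
import pandas as pd

# Hypothetical sample: every record has "kids", but only the first has "city".
records = [
    {"name": "Jimmy", "city": "Oslo", "kids": [{"kidName": "johnson_junior"}]},
    {"name": "Anna", "kids": [{"kidName": "anna_junior"}]},  # no "city" key
]

# A meta key that is missing from some records is tolerated with
# errors="ignore": the missing value is filled with NaN.
with_meta = pd.json_normalize(
    records, record_path="kids", meta=["name", "city"], errors="ignore"
)

# A missing record_path key still raises KeyError, even with errors="ignore".
records.append({"name": "Jon"})  # no "kids" key
try:
    pd.json_normalize(records, record_path="kids", meta=["name"], errors="ignore")
    record_path_error = None
except KeyError as exc:
    record_path_error = exc
```

So errors='ignore' only helps when the gap is in a meta column; the record path itself has to exist in every element.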

I don't know whether you were asking for general advice, as in "what happens if the record path key is sometimes not available?", but in case the data example you provided is your actual problem, that is the quick fix I would propose.

Something like this works:

from pandas import json_normalize  # pandas >= 1.0; on older versions: from pandas.io.json import json_normalize

data = [{"name": "Jon", "last": "Jonny"},
        {"name": "Jimmy", "last": "johnson", "kids": [{"kidName": "johnson_junior", "kidAge": "1"}, {"kidName": "johnson_junior2", "kidAge": "4"}]}]

# then fill in an empty "kids" record wherever the key is missing,
# if an extra loop doesn't alter your desired flow that much
for elem in data:
    if 'kids' not in elem:
        elem['kids'] = [{'kidName': None, 'kidAge': None}]

# finally, normalize
flat_json = json_normalize(data, 'kids', ['name', 'last'])

The output:

  kidAge          kidName   name     last
0   None             None    Jon    Jonny
1      1   johnson_junior  Jimmy  johnson
2      4  johnson_junior2  Jimmy  johnson
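For the more general case, where the record path key is only sometimes available, the same idea can be wrapped in a small helper that builds filled-in copies instead of modifying the original records. This is only a sketch under assumptions: normalize_optional_records and its parameters are hypothetical names, not part of pandas, and it uses pd.json_normalize from pandas 1.0 or later.

```python
import pandas as pd


def normalize_optional_records(records, record_path, record_keys, meta):
    """Flatten records whose `record_path` list may be missing or empty.

    Hypothetical helper: a missing or empty list is replaced by a single
    placeholder record with None for every key in `record_keys`.
    """
    placeholder = [{key: None for key in record_keys}]
    # Build shallow copies so the caller's records are left untouched.
    filled = [
        {**rec, record_path: rec.get(record_path) or placeholder}
        for rec in records
    ]
    return pd.json_normalize(filled, record_path=record_path, meta=meta)


data = [
    {"name": "Jon", "last": "Jonny"},
    {"name": "Jimmy", "last": "johnson",
     "kids": [{"kidName": "johnson_junior", "kidAge": "1"},
              {"kidName": "johnson_junior2", "kidAge": "4"}]},
]

flat = normalize_optional_records(
    data, "kids", record_keys=["kidName", "kidAge"], meta=["name", "last"]
)
```

As with the quick fix, records without kids end up as one row with empty kid columns, so dropping those rows afterwards is an extra step if they are not wanted.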
