Python Pandas - Flatten Nested JSON
I am working with nested JSON data that I am trying to transform into a Pandas DataFrame. The json_normalize function offers a way to accomplish this.
{'locations': [{'accuracy': 17,
                'activity': [{'activity': [{'confidence': 100,
                                            'type': 'STILL'}],
                              'timestampMs': '1542652'}],
                'altitude': -10,
                'latitudeE7': 3777321,
                'longitudeE7': -122423125,
                'timestampMs': '1542654',
                'verticalAccuracy': 2}]}
I used the function to normalize locations; however, the nested part 'activity' is not flattened.
Here's my attempt:
activity_data = json_normalize(d, 'locations', ['activity','type', 'confidence'],
                               meta_prefix='Prefix.',
                               errors='ignore')
DataFrame:
[{u'activity': [{u'confidence': 100, u'type': ... -10.0 NaN 377777377 -1224229340 1542652023196
The activity column still contains nested elements, which I need unpacked into their own columns.
Any suggestions or tips would be much appreciated.
The following function recursively flattens nested dicts and lists:
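def flatten_json(nested_json: dict, exclude: list=['']) -> dict:
    """
    Flatten a nested dict (which may contain lists and dicts) into a single-level dict.
    :param nested_json: the nested dict to flatten
    :param exclude: dict keys to skip
    :return: the flattened dict
    """
    out = dict()

    def flatten(x: (list, dict, str), name: str='', exclude=exclude):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f'{name}{a}_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, f'{name}{i}_')
                i += 1
        else:
            out[name[:-1]] = x  # strip the trailing underscore from the key

    flatten(nested_json)
    return out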
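To make the key-naming scheme concrete, here is a minimal sketch using a compact variant of the function above. The name flatten_record is hypothetical and not part of the original answer; it is only meant to show how each dict key and list index along the path is joined with an underscore, and how exclude skips a key.

```python
def flatten_record(obj, prefix='', exclude=()):
    """Hypothetical compact variant of flatten_json, for illustration only."""
    out = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key not in exclude:
                out.update(flatten_record(value, f'{prefix}{key}_', exclude))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            out.update(flatten_record(value, f'{prefix}{i}_', exclude))
    else:
        out[prefix[:-1]] = obj  # drop the trailing underscore
    return out


record = {'accuracy': 17,
          'activity': [{'activity': [{'confidence': 100, 'type': 'STILL'}],
                        'timestampMs': '1542652'}]}

flat = flatten_record(record)
# Keys are built from the path, e.g. 'activity_0_activity_0_type'
print(flat)

# Skipping a key drops its whole subtree
print(flatten_record(record, exclude=('activity',)))
```

Because list positions become part of the column name, records whose lists differ in length will produce different sets of columns, and the missing ones will show up as NaN in the resulting DataFrame.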
data is a JSON-like dict:
data = {'locations': [{'accuracy': 17,'activity': [{'activity': [{'confidence': 100,'type': 'STILL'}],'timestampMs': '1542652'}],'altitude': -10,'latitudeE7': 3777321,'longitudeE7': -122423125,'timestampMs': '1542654','verticalAccuracy': 2},
{'accuracy': 17,'activity': [{'activity': [{'confidence': 100,'type': 'STILL'}],'timestampMs': '1542652'}],'altitude': -10,'latitudeE7': 3777321,'longitudeE7': -122423125,'timestampMs': '1542654','verticalAccuracy': 2},
{'accuracy': 17,'activity': [{'activity': [{'confidence': 100,'type': 'STILL'}],'timestampMs': '1542652'}],'altitude': -10,'latitudeE7': 3777321,'longitudeE7': -122423125,'timestampMs': '1542654','verticalAccuracy': 2}]}
Use flatten_json:
import pandas as pd
df = pd.DataFrame([flatten_json(x) for x in data['locations']])
accuracy  activity_0_activity_0_confidence  activity_0_activity_0_type  activity_0_timestampMs  altitude  latitudeE7  longitudeE7  timestampMs  verticalAccuracy
      17                               100                       STILL                 1542652       -10     3777321   -122423125      1542654                 2
      17                               100                       STILL                 1542652       -10     3777321   -122423125      1542654                 2
      17                               100                       STILL                 1542652       -10     3777321   -122423125      1542654                 2
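As an alternative to a custom function, json_normalize itself can reach the inner list by passing the full path as record_path. The sketch below assumes pandas 1.0 or later (where the function is exposed as pd.json_normalize) and assumes every location has an 'activity' list. It produces one row per innermost activity entry rather than one row per location, and it does not carry over the outer activity's timestampMs.

```python
import pandas as pd

data = {'locations': [{'accuracy': 17,
                       'activity': [{'activity': [{'confidence': 100,
                                                   'type': 'STILL'}],
                                     'timestampMs': '1542652'}],
                       'altitude': -10,
                       'latitudeE7': 3777321,
                       'longitudeE7': -122423125,
                       'timestampMs': '1542654',
                       'verticalAccuracy': 2}]}

# Top-level fields of each location to repeat on every activity row
meta_cols = ['accuracy', 'altitude', 'latitudeE7', 'longitudeE7',
             'timestampMs', 'verticalAccuracy']

# record_path walks locations -> 'activity' list -> inner 'activity' list
df = pd.json_normalize(data['locations'],
                       record_path=['activity', 'activity'],
                       meta=meta_cols)
print(df)
```

This keeps column names short (confidence, type) but repeats the location fields when a location has several activity entries, so which approach fits better depends on whether one row per location or one row per activity is wanted.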