Flatten Pandas DataFrame from nested json list

Perhaps somebody could help me. I tried to flatten the following list into a pandas DataFrame:

[{u'_id': u'2',
  u'_index': u'list',
  u'_score': 1.4142135,
  u'_source': {u'name': u'name3'},
  u'_type': u'doc'},
 {u'_id': u'5',
  u'_index': u'list',
  u'_score': 1.4142135,
  u'_source': {u'dat': u'2016-12-12', u'name': u'name2'},
  u'_type': u'doc'},
 {u'_id': u'1',
  u'_index': u'list',
  u'_score': 1.4142135,
  u'_source': {u'name': u'name1'},
  u'_type': u'doc'}]

The result should look like:

|_id   | _index | _score | name | dat        | _type |
------------------------------------------------------
|1     |list    |1.4142..| name1| nan        | doc   |
|2     |list    |1.4142..| name3| nan        | doc   |
|3     |list    |1.4142..| name1| 2016-12-12 | doc   |

But nothing I tried produced the desired result. I used something like this:

df = pd.concat(map(pd.DataFrame.from_dict, res['hits']['hits']), axis=1)['_source'].T

But then I lose the fields outside the _source field, such as _type. I also tried to work with

test = pd.DataFrame(list)
for index, row in test.iterrows():
  test.loc[index,'d'] = 

But I have no idea how to take the _source field and append its contents to the original data frame.

Does somebody have an idea how to do that and get the desired outcome?

Use json_normalize:

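import pandas as pd

# data is the list of dicts from the question.
# pandas >= 1.0 exposes json_normalize at the top level;
# older versions used: from pandas.io.json import json_normalize
# (column order in the printed frame may differ across pandas versions)
df = pd.json_normalize(data)
print(df)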
  _id _index    _score _source.dat _source.name _type
0   2   list  1.414214         NaN        name3   doc
1   5   list  1.414214  2016-12-12        name2   doc
2   1   list  1.414214         NaN        name1   doc
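
The flattened columns come back as `_source.name` and `_source.dat`, while the question asks for plain `name` and `dat`. A minimal sketch of one way to get that layout, assuming pandas 1.0 or later and the sample records from the question (the column order and the sort by `_id` are my reading of the desired table, not something the answer states):

```python
import pandas as pd

# Sample records copied from the question (u'' prefixes dropped for Python 3).
data = [
    {"_id": "2", "_index": "list", "_score": 1.4142135,
     "_source": {"name": "name3"}, "_type": "doc"},
    {"_id": "5", "_index": "list", "_score": 1.4142135,
     "_source": {"dat": "2016-12-12", "name": "name2"}, "_type": "doc"},
    {"_id": "1", "_index": "list", "_score": 1.4142135,
     "_source": {"name": "name1"}, "_type": "doc"},
]

# Flatten the nested "_source" dicts into "_source.<key>" columns.
df = pd.json_normalize(data)

# Strip the "_source." prefix so nested keys become plain column names.
prefix = "_source."
df = df.rename(columns=lambda c: c[len(prefix):] if c.startswith(prefix) else c)

# Assumed layout: column order taken from the table in the question,
# rows sorted by _id (string comparison, which is enough for these ids).
df = df[["_id", "_index", "_score", "name", "dat", "_type"]]
df = df.sort_values("_id").reset_index(drop=True)

print(df)
```

Records without a `dat` key end up with NaN in that column, as in the desired table. Stripping the prefix this way would produce duplicate column names if a key inside `_source` had the same name as a top-level key, so it only suits data where the two sets of keys do not overlap.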
