Transforming nested JSON coming from an API into a DataFrame with Python

This is the JSON coming from the API, put into a list:

final = [{
    "email": "hildamicalay@yahoo.es",
    "createdAt": "2019-10-14T15:22:35.188-05:00",
    "statistics": {
        "clicked": [
            {
                "campaignId": 415,
                "links": [
                    {
                        "count": 1,
                        "eventTime": "2019-10-17T17:29:16.551-05:00"
                    }
                ]
            },
            # ... entries 1, 2 and 3 (elided in the original)
        ]
        # ... (rest elided in the original)
    }
}]

I want to make a DataFrame from this. The desired output is:

   count  campaignId   eventTime                  email                      createdAt   status
0      1         415  2019-10-17  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
1      1         415  2020-06-15  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
2      1         415  2020-06-15  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
3      1         415  2020-06-15  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
4      2         415  2020-10-14  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
5      1         415  2020-10-14  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked

This is what I have tried:

import json
import pandas as pd

df3 = pd.json_normalize(final,['statistics','clicked','links'],['email','createdAt'],errors='ignore')
df3['status'] = 'clicked'

But I only get the output below. I am trying to get campaignId into the DataFrame; any help would be appreciated.

   count                      eventTime                  email                      createdAt   status
0      1  2019-10-17T17:29:16.551-05:00  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
1      1  2020-06-15T18:27:33.179-05:00  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
2      1  2020-06-15T18:21:32.942-05:00  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
3      1  2020-06-15T18:22:46.963-05:00  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
4      2  2020-10-14T18:23:18.949-05:00  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
5      1  2020-10-14T18:25:42.373-05:00  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked

Do you want to convert eventTime to a date?

# Parse the ISO 8601 strings into datetimes, then keep only the date part
df3['eventTime'] = pd.to_datetime(df3['eventTime'])
df3['eventTime'] = df3['eventTime'].dt.date

Solved!

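# Normalize one level higher (at 'clicked') so campaignId is kept as a column
df3 = pd.json_normalize(final, ['statistics', 'clicked'], ['email', 'createdAt'], errors='ignore')
# One row per link; reset the index so the concat below aligns row by row
df3 = df3.explode('links').reset_index(drop=True)
# Expand each link dict into count/eventTime columns and drop the raw 'links' column
df3 = pd.concat([df3.drop(columns='links'), df3['links'].apply(pd.Series)], axis=1)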
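
As a possible alternative to explode, json_normalize also accepts a nested path inside meta, so campaignId can be pulled from the 'clicked' level while the records still come from 'links'. The sketch below is only an illustration: the sample data is hypothetical (it mirrors the shape shown in the question, but how the links are grouped under each 'clicked' entry is an assumption), and the rename of the generated column is my own choice.

```python
import pandas as pd

# Hypothetical sample with the same shape as the question's data.
# The grouping of links under each "clicked" entry is assumed for illustration.
final = [{
    "email": "hildamicalay@yahoo.es",
    "createdAt": "2019-10-14T15:22:35.188-05:00",
    "statistics": {
        "clicked": [
            {"campaignId": 415,
             "links": [{"count": 1, "eventTime": "2019-10-17T17:29:16.551-05:00"}]},
            {"campaignId": 415,
             "links": [{"count": 1, "eventTime": "2020-06-15T18:27:33.179-05:00"},
                       {"count": 2, "eventTime": "2020-10-14T18:23:18.949-05:00"}]},
        ]
    },
}]

# Records come from the innermost "links" lists; the nested meta path
# picks campaignId from the enclosing "clicked" entry for each record.
df = pd.json_normalize(
    final,
    record_path=["statistics", "clicked", "links"],
    meta=["email", "createdAt", ["statistics", "clicked", "campaignId"]],
)

# The nested meta column is named by joining its path with "."; shorten it.
df = df.rename(columns={"statistics.clicked.campaignId": "campaignId"})
df["status"] = "clicked"

print(df)
```

Meta columns produced this way have object dtype, so campaignId may need an explicit astype(int) if numeric operations are planned.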
