Transforming nested JSON from an API into a DataFrame with Python

This is the JSON returned by the API, stored in a list:

final = [{
"email":"hildamicalay@yahoo.es",
"createdAt":"2019-10-14T15:22:35.188-05:00",
"statistics":{
      "clicked":[
         {
            "campaignId":415,
            "links":[
               {
                  "count":1,
                  "eventTime":"2019-10-17T17:29:16.551-05:00"
               }
            ]
         },
         {...},
         {...},
         {...}
      ]
...
}]

I want to build a DataFrame from this. The desired output is:

   count  campaignId   eventTime                  email                      createdAt   status
0      1         415  2019-10-17  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
1      1         415  2020-06-15  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
2      1         415  2020-06-15  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
3      1         415  2020-06-15  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
4      2         415  2020-10-14  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked
5      1         415  2020-10-14  hildamicalay@yahoo.es  2019-10-14T15:22:35.188-05:00  clicked

What I have tried is this:

import json
import pandas as pd

df3 = pd.json_normalize(final,['statistics','clicked','links'],['email','createdAt'],errors='ignore')
df3['status'] = 'clicked'

But I only get the following. campaignId is missing from the DataFrame, and I would like to include it. Any help would be appreciated.

 count                      eventTime                   email                       createdAt    status
0   1   2019-10-17T17:29:16.551-05:00   hildamicalay@yahoo.es   2019-10-14T15:22:35.188-05:00   clicked
1   1   2020-06-15T18:27:33.179-05:00   hildamicalay@yahoo.es   2019-10-14T15:22:35.188-05:00   clicked
2   1   2020-06-15T18:21:32.942-05:00   hildamicalay@yahoo.es   2019-10-14T15:22:35.188-05:00   clicked
3   1   2020-06-15T18:22:46.963-05:00   hildamicalay@yahoo.es   2019-10-14T15:22:35.188-05:00   clicked
4   2   2020-10-14T18:23:18.949-05:00   hildamicalay@yahoo.es   2019-10-14T15:22:35.188-05:00   clicked
5   1   2020-10-14T18:25:42.373-05:00   hildamicalay@yahoo.es   2019-10-14T15:22:35.188-05:00   clicked

Do you want to convert eventTime to a date?

# df3 is the DataFrame built above; .dt is the pandas datetime accessor
df3['eventTime'] = pd.to_datetime(df3['eventTime'])
df3['eventTime'] = df3['eventTime'].dt.date
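
One thing worth checking when converting eventTime: the timestamps carry a -05:00 offset, so the date you get depends on whether the offset is kept or the value is first converted to UTC. The sketch below uses a hypothetical late-evening timestamp (not from the data above) to show the difference.

```python
import pandas as pd

# Hypothetical late-evening timestamp, invented to show the difference.
times = pd.Series(["2020-10-14T22:30:00.000-05:00"])

# Parsing keeps the -05:00 offset, so .dt.date is the local calendar date.
local_dates = pd.to_datetime(times).dt.date

# utc=True converts to UTC first, which moves this event to the next day.
utc_dates = pd.to_datetime(times, utc=True).dt.date

print(local_dates.iloc[0], utc_dates.iloc[0])
```

For the desired output in the question, the local date (without utc=True) appears to be what is wanted.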

Solved!

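# Stop one level higher so each row keeps campaignId alongside its links list
df3 = pd.json_normalize(final,['statistics','clicked'],['email','createdAt'],errors='ignore')
# One row per link, then spread each link dict into count/eventTime columns
df3 = df3.explode('links')
df3 = pd.concat([df3 ,df3['links'].apply(pd.Series)],axis=1)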
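
As a possible alternative to explode, json_normalize can also pull campaignId directly if the meta argument is given a nested path that follows the record path down to the level where the field lives. This is a minimal sketch, not the approach used above. The sample data is hypothetical: it mirrors the structure in the question, and the second campaign (id 416) is invented for illustration.

```python
import pandas as pd

# Hypothetical sample shaped like the API response in the question;
# the second campaign (id 416) is invented for illustration.
sample = [{
    "email": "hildamicalay@yahoo.es",
    "createdAt": "2019-10-14T15:22:35.188-05:00",
    "statistics": {
        "clicked": [
            {"campaignId": 415,
             "links": [{"count": 1,
                        "eventTime": "2019-10-17T17:29:16.551-05:00"}]},
            {"campaignId": 416,
             "links": [{"count": 2,
                        "eventTime": "2020-10-14T18:23:18.949-05:00"},
                       {"count": 1,
                        "eventTime": "2020-10-14T18:25:42.373-05:00"}]},
        ]
    },
}]

# record_path walks down to the individual link dicts; the nested meta
# path picks up campaignId from each "clicked" entry on the way down.
flat = pd.json_normalize(
    sample,
    record_path=["statistics", "clicked", "links"],
    meta=["email", "createdAt", ["statistics", "clicked", "campaignId"]],
)

# The nested meta column is named after its full path; shorten it.
flat = flat.rename(columns={"statistics.clicked.campaignId": "campaignId"})
flat["status"] = "clicked"

print(flat)
```

This yields one row per link with count, eventTime, email, createdAt, campaignId and status, and leaves no leftover links column to drop.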
