
Not able to extract data into Pandas dataframe in correct format

I am trying to extract data from an API and write it into a Pandas DataFrame so that I can do some transformations.

import requests

headers = {
    'Authorization': 'Api-Key',
}

params = (
    ('locodes', 'PLWRO,DEHAM'),
)

response = requests.get('https://api.xxx.com/weather/v1/forecasts', headers=headers, params=params)

The result of the API call:

response.text

'{"results":[{"place":{"type":"locode","value":"PLWRO"},"measures":[{"ts":1571896800000,"t2m":10.72,"t_min":10.53,"t_max":11.99,"wspd":8,"dir":"SE","wgust":12,"rh2m":87,"prsmsl":1012,"skcover":"clear","precip":0.0,"snowd":0,"thunderstorm":"N","fog":"H"}]},{"place":{"type":"locode","value":"DEHAM"},"measures":[{"ts":1571896800000,"t2m":10.79,"t_min":10.3,"t_max":10.9,"wspd":13,"dir":"ESE","wgust":31,"rh2m":97,"prsmsl":1008,"skcover":"partly_cloudy","precip":0.0,"snowd":0,"thunderstorm":"N","fog":"H"}]}]}'

When I try to load it into a pandas DataFrame, it does not come out in the correct format.

import pandas as pd
import io
urlData = response.content
rawData = pd.read_csv(io.StringIO(urlData.decode('utf-8')))

Current output: [image]

How can I get the values to populate correctly under each header?

Expected format: [image]

First convert the JSON to dictionaries with json.loads. Some processing is then needed: add the locode to each measures entry by merging the dictionaries, append each merged dictionary to a list, and finally call the DataFrame constructor:

import json
import pandas as pd

d = json.loads(response.text)

out = []
for x in d['results']:
    # key/value describing the place, e.g. 'locode' / 'PLWRO'
    t = x['place']['type']
    v = x['place']['value']
    for y in x['measures']:
        # put the place key first, then the measure fields
        out.append({t: v, **y})

df = pd.DataFrame(out)
print(df)
  locode             ts    t2m  t_min  t_max  wspd  dir  wgust  rh2m  prsmsl  \
0  PLWRO  1571896800000  10.72  10.53  11.99     8   SE     12    87    1012   
1  DEHAM  1571896800000  10.79  10.30  10.90    13  ESE     31    97    1008   

         skcover  precip  snowd thunderstorm fog  
0          clear     0.0      0            N   H  
1  partly_cloudy     0.0      0            N   H  
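
As an alternative to the explicit loop, pandas can flatten this kind of nested structure itself. The following is only a sketch, assuming pandas 1.0 or later (where pd.json_normalize is available at the top level) and using a shortened copy of the response from the question with just two measure fields. The rename and the column reordering are choices made here to match the output above, not something the API requires.

```python
import json
import pandas as pd

# Shortened copy of the response text from the question (only ts and t2m kept).
text = (
    '{"results":['
    '{"place":{"type":"locode","value":"PLWRO"},'
    '"measures":[{"ts":1571896800000,"t2m":10.72}]},'
    '{"place":{"type":"locode","value":"DEHAM"},'
    '"measures":[{"ts":1571896800000,"t2m":10.79}]}'
    ']}'
)

# With a live call, response.json() should give the same dict.
data = json.loads(text)

# One row per entry in "measures"; the nested place value is carried along
# as a meta column named "place.value".
df = pd.json_normalize(
    data["results"],
    record_path="measures",
    meta=[["place", "value"]],
)

# Rename the meta column and move it to the front.
df = df.rename(columns={"place.value": "locode"})
df = df[["locode"] + [c for c in df.columns if c != "locode"]]
print(df)
```

This hard-codes the column name locode, whereas the loop reads it from place.type, so the loop is the safer choice if the API can return other place types.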

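You can use the ast (Abstract Syntax Trees) module to convert the string to a Python dictionary with ast.literal_eval. This works here because the response contains only strings and numbers; JSON literals such as true, false, or null are not valid Python literals. You can read more about a use case of ast at this StackOverflow post.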

In your case I would do:

import ast
import pandas as pd

response = '{"results":[{"place":{"type":"locode","value":"PLWRO"},"measures":[{"ts":1571896800000,"t2m":10.72,"t_min":10.53,"t_max":11.99,"wspd":8,"dir":"SE","wgust":12,"rh2m":87,"prsmsl":1012,"skcover":"clear","precip":0.0,"snowd":0,"thunderstorm":"N","fog":"H"}]},{"place":{"type":"locode","value":"DEHAM"},"measures":[{"ts":1571896800000,"t2m":10.79,"t_min":10.3,"t_max":10.9,"wspd":13,"dir":"ESE","wgust":31,"rh2m":97,"prsmsl":1008,"skcover":"partly_cloudy","precip":0.0,"snowd":0,"thunderstorm":"N","fog":"H"}]}]}'

# convert the response string to a Python dict
response_to_dict = ast.literal_eval(response)

# build a DataFrame from the measures of the first result (PLWRO) only
df = pd.DataFrame(response_to_dict['results'][0]['measures'])

Output:

|---|---|------|------|----|---|-----|------------|----|-----|----|
|dir|fog|precip|prsmsl|rh2m|...|t_min|thunderstorm|ts  |wgust|wspd|
|SE |H  |0.0   |1012  | 87 |...|10.53|N           |15..|12   | 8  |
|---|---|------|------|----|---|-----|------------|----|-----|----|
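
One caveat with ast.literal_eval is worth checking before relying on it for API responses. The small sketch below uses a made-up payload (not from the weather API in the question) to show how it behaves next to json.loads when the JSON contains true or null:

```python
import ast
import json

# Hypothetical payload, chosen only because it contains JSON-specific literals.
payload = '{"ok": true, "value": null}'

# json.loads maps true -> True and null -> None.
parsed = json.loads(payload)
print(parsed)

# ast.literal_eval only accepts Python literals, so true/null are rejected.
try:
    ast.literal_eval(payload)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False

print(literal_eval_ok)
```

For responses like the one in the question either approach gives the same dictionary, but json.loads (or response.json()) is the one intended for JSON.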
