
I can't correctly display a JSON dataframe from an API

I am trying to read some data from a public API. It offers several output formats (json, csv, txt, among others); you just change the last segment of the URL (/json, /csv, /txt, ...). The URLs look like this:

https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/csv/
https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/json/
...

My problem is that when I try to load the data into a Pandas dataframe, it isn't read correctly. I have tried the following alternatives:

import pandas as pd
import requests

url = 'https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/json/'
r = requests.get(url)

rjson = r.json()

df = pd.json_normalize(rjson)
df['periods']


I also tried reading the data in CSV format:

import pandas as pd
import requests

url = 'https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/csv/'

collisions = pd.read_csv(url, sep='<br>')
collisions.head()

But I don't get good results. With the JSON approach, the dataframe is not displayed correctly because the 'periods' column holds all the values grouped together in a single cell.

With the CSV approach, all the data appears as column headers instead of rows.

The result I expect is a table with one row per period.

What alternative do you recommend trying?

Thank you in advance for your time and help!

I look forward to your answers. Regards!

For CSV, you can use StringIO from the io package. The response separates rows with <br> instead of newlines, so replace those before parsing:

In [20]: import requests

In [21]: res = requests.get("https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/csv/")

In [22]: import pandas as pd

In [23]: import io

In [24]: df = pd.read_csv(io.StringIO(res.text.strip().replace("<br>","\n")), engine='python')

In [25]: df
Out[25]:
   Mes/A&ntilde;o  Tipo de cambio - promedio del periodo (S/ por US$) - Bancario - Promedio
0        Jul.2018                                           3.276595
1        Ago.2018                                           3.288071
2        Sep.2018                                           3.311325
3        Oct.2018                                           3.333909
4        Nov.2018                                           3.374675
5        Dic.2018                                           3.364026
6        Ene.2019                                           3.343864
7        Feb.2019                                           3.321475
8        Mar.2019                                           3.304690
9        Abr.2019                                           3.303825
10       May.2019                                           3.332364
11       Jun.2019                                           3.325650
12       Jul.2019                                           3.290214
13       Ago.2019                                           3.377560
14       Sep.2019                                           3.357357
15       Oct.2019                                           3.359762
16       Nov.2019                                           3.371700
17       Dic.2019                                           3.355190
18       Ene.2020                                           3.327364
19       Feb.2020                                           3.390350
20       Mar.2020                                           3.491364
21       Abr.2020                                           3.397500
22       May.2020                                           3.421150
23       Jun.2020                                           3.470167
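
The header in the output above still contains the HTML entity &ntilde; and a very long column name. A minimal sketch of one way to tidy that up, assuming the response body has the same shape as above; sample_text is a made-up stand-in for res.text, and parse_br_csv and the short column names are hypothetical:

```python
import html
import io

import pandas as pd

# Hypothetical sample that mimics the response body: rows are separated by
# "<br>" and the header contains an HTML entity. Not real API output.
sample_text = (
    'Mes/A&ntilde;o,"Tipo de cambio - Bancario - Promedio"<br>'
    "Jul.2018,3.276595<br>"
    "Ago.2018,3.288071<br>"
)


def parse_br_csv(text):
    """Parse a CSV payload whose rows are separated by '<br>'."""
    # Turn the "<br>" separators into real newlines, then decode entities
    # such as "&ntilde;" so the header reads "Mes/Año".
    cleaned = html.unescape(text.strip().replace("<br>", "\n"))
    return pd.read_csv(io.StringIO(cleaned))


raw_df = parse_br_csv(sample_text)

# Optional: swap the long headers for short names (names chosen here).
df = raw_df.copy()
df.columns = ["month_year", "value"]
print(df)
```

With the real API, res.text from the session above would take the place of sample_text.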

Sorry, I couldn't find the link about reading JSON that has multiple objects inside it. The point is that we can't use json.load()/json.loads() for this kind of format, so we have to use raw_decode() instead.

This code should work:

import json
import urllib.request as ur

import pandas as pd

decoder = json.JSONDecoder()
url = 'https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/json/'

# Read the response and decode every JSON object found in it
data = []
with ur.urlopen(url) as json_file:
    text = json_file.read().decode()  # decode to convert the bytes into a normal string
    while True:
        try:
            obj, end = decoder.raw_decode(text)
        except ValueError:
            break  # nothing left that can be decoded
        data.append(obj)
        text = text[end:]

# Build one dictionary per period to convert into a dataframe
clean_list = []
for period in data[0]['periods']:
    clean_list.append({
        "month_year": period['name'],
        "value": period['values'][0],
    })

# pd.options.display.width = 0
df = pd.DataFrame(clean_list)
print(df)

Result:

   month_year             value
0    Jul.2018  3.27659523809524
1    Ago.2018  3.28807142857143
2    Sep.2018          3.311325
3    Oct.2018  3.33390909090909
4    Nov.2018          3.374675
5    Dic.2018  3.36402631578947
6    Ene.2019  3.34386363636364
7    Feb.2019          3.321475
8    Mar.2019  3.30469047619048
9    Abr.2019          3.303825
10   May.2019  3.33236363636364
11   Jun.2019           3.32565
12   Jul.2019  3.29021428571428
13   Ago.2019           3.37756
14   Sep.2019  3.35735714285714
15   Oct.2019   3.3597619047619
16   Nov.2019            3.3717
17   Dic.2019  3.35519047619048
18   Ene.2020  3.32736363636364
19   Feb.2020           3.39035
20   Mar.2020  3.49136363636364
21   Abr.2020            3.3975
22   May.2020           3.42115
23   Jun.2020  3.47016666666667
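
To show what raw_decode() does on its own, here is a small offline sketch. The sample string and the helper name iter_json_objects are made up for illustration. raw_decode() returns the decoded object together with the index where it stopped, which makes it possible to walk through several JSON documents concatenated in one string:

```python
import json


def iter_json_objects(text):
    """Yield each JSON document found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode() does not skip leading whitespace, so do it here.
        if text[pos].isspace():
            pos += 1
            continue
        # Returns the decoded object and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Hypothetical input: three JSON documents in a single string.
sample = '{"a": 1} {"b": 2}\n[3, 4]'
objects = list(iter_json_objects(sample))
print(objects)
```

Passing the start index to raw_decode() avoids re-slicing the string on every iteration.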

If I find the link again, I'll edit or comment on my answer.
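
As a possible alternative, not taken from the answers above: if the response parses as a single JSON object (as r.json() in the question suggests), pd.json_normalize can expand the 'periods' list into rows via its record_path argument. The payload below is a hand-written sample that only assumes the 'periods' / 'name' / 'values' layout used in the code above; it is not real API output.

```python
import pandas as pd

# Hand-written sample with the same layout the code above relies on:
# a "periods" list whose items carry a "name" and a one-element "values" list.
payload = {
    "periods": [
        {"name": "Jul.2018", "values": ["3.27659523809524"]},
        {"name": "Ago.2018", "values": ["3.28807142857143"]},
        {"name": "Sep.2018", "values": ["3.311325"]},
    ]
}

# record_path tells json_normalize which nested list should become the rows.
periods = pd.json_normalize(payload, record_path="periods")

# Take the first element of each "values" list and convert it to a float.
df = pd.DataFrame({
    "month_year": periods["name"],
    "value": pd.to_numeric(periods["values"].apply(lambda v: v[0])),
})
print(df)
```

With the real API, rjson from the question would take the place of payload.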
