
I can't correctly visualize a JSON dataframe from an API

I am currently trying to read some data from a public API. It offers several output formats (JSON, CSV, TXT, among others); you only need to change the last segment of the URL (/json, /csv, /txt, ...). The URLs look like this:

https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/csv/
https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/json/
...

My problem is that when I try to import the data into a pandas dataframe, it is not read correctly. I have tried the following alternatives:

import pandas as pd
import requests

url = 'https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/json/'
r = requests.get(url)

rjson = r.json()

df = pd.json_normalize(rjson)


I also tried reading the data in CSV format:

import pandas as pd
import requests

url = 'https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/csv/'

collisions = pd.read_csv(url, sep='<br>')

But I don't get good results; the dataframe is not displayed correctly, because the 'periods' column holds all the values grouped together.

The output is displayed as follows:


All the data appears as columns:

Here is an example of how the data should be displayed:


What alternative do you recommend trying?

Thank you in advance for your time and help!

I will be attentive to your answers. Regards!

For CSV, you can use StringIO from the io package:

In [20]: import requests

In [21]: res = requests.get("https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/csv/")

In [22]: import pandas as pd

In [23]: import io

In [24]: df = pd.read_csv(io.StringIO(res.text.strip().replace("<br>","\n")), engine='python')

In [25]: df
   Mes/A&ntilde;o  Tipo de cambio - promedio del periodo (S/ por US$) - Bancario - Promedio
0        Jul.2018                                           3.276595
1        Ago.2018                                           3.288071
2        Sep.2018                                           3.311325
3        Oct.2018                                           3.333909
4        Nov.2018                                           3.374675
5        Dic.2018                                           3.364026
6        Ene.2019                                           3.343864
7        Feb.2019                                           3.321475
8        Mar.2019                                           3.304690
9        Abr.2019                                           3.303825
10       May.2019                                           3.332364
11       Jun.2019                                           3.325650
12       Jul.2019                                           3.290214
13       Ago.2019                                           3.377560
14       Sep.2019                                           3.357357
15       Oct.2019                                           3.359762
16       Nov.2019                                           3.371700
17       Dic.2019                                           3.355190
18       Ene.2020                                           3.327364
19       Feb.2020                                           3.390350
20       Mar.2020                                           3.491364
21       Abr.2020                                           3.397500
22       May.2020                                           3.421150
23       Jun.2020                                           3.470167
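
The same idea can be tried offline. The sketch below is only an illustration: `read_br_csv` is a hypothetical helper name, and the sample string is made up to imitate a response whose rows are separated by `<br>`; it is not real API output. It also decodes HTML entities in the column names, on the assumption that the `&ntilde;` in the header above is an HTML entity.

```python
import html
import io

import pandas as pd


def read_br_csv(text: str) -> pd.DataFrame:
    """Parse CSV text whose rows are separated by '<br>' instead of newlines."""
    # Turn the '<br>' row separators into real line breaks.
    cleaned = text.strip().replace("<br>", "\n")
    df = pd.read_csv(io.StringIO(cleaned))
    # Decode HTML entities such as '&ntilde;' in the column names.
    df.columns = [html.unescape(str(col)) for col in df.columns]
    return df


# Hypothetical sample imitating the response layout; not real API output.
sample = "Mes/A&ntilde;o,Tipo de cambio<br>Jul.2018,3.27<br>Ago.2018,3.29<br>"
df = read_br_csv(sample)
print(df)
```

With a live request, `res.text` from the session above would take the place of `sample`.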

Sorry, I couldn't find the link about reading JSON with multiple objects inside it. The point is that we can't use load/loads for this kind of format, so we have to use raw_decode() instead.

This code should work:

import pandas as pd
import json
import urllib.request as ur

d = json.JSONDecoder()
url = 'https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01210PM/json/'

# read the response and decode it into a list of dictionaries
data = []
with ur.urlopen(url) as json_file:
    x = json_file.read().decode()  # decode to convert the bytes into a normal string
    while True:
        try:
            # j is the decoded object, n is the index where it ended in x
            j, n = d.raw_decode(x)
        except ValueError:
            # nothing decodable is left, so stop
            break
        data.append(j)
        x = x[n:]

# build a list of flat dictionaries to convert into a dataframe
clean_list = []
for period in data[0]['periods']:
    dict_data = {
        "month_year": period['name'],
        "value": period['values'][0],
    }
    clean_list.append(dict_data)

#pd.options.display.width = 0
df = pd.DataFrame(clean_list)
print(df)


   month_year             value
0    Jul.2018  3.27659523809524
1    Ago.2018  3.28807142857143
2    Sep.2018          3.311325
3    Oct.2018  3.33390909090909
4    Nov.2018          3.374675
5    Dic.2018  3.36402631578947
6    Ene.2019  3.34386363636364
7    Feb.2019          3.321475
8    Mar.2019  3.30469047619048
9    Abr.2019          3.303825
10   May.2019  3.33236363636364
11   Jun.2019           3.32565
12   Jul.2019  3.29021428571428
13   Ago.2019           3.37756
14   Sep.2019  3.35735714285714
15   Oct.2019   3.3597619047619
16   Nov.2019            3.3717
17   Dic.2019  3.35519047619048
18   Ene.2020  3.32736363636364
19   Feb.2020           3.39035
20   Mar.2020  3.49136363636364
21   Abr.2020            3.3975
22   May.2020           3.42115
23   Jun.2020  3.47016666666667
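
To see what raw_decode() does without calling the API, here is a minimal offline sketch. `decode_concatenated` is a hypothetical helper name and the sample string is invented for illustration. Unlike the loop above, it passes a start index to raw_decode() instead of slicing the string, and it skips whitespace between objects, since raw_decode() does not accept leading whitespace.

```python
import json


def decode_concatenated(text: str) -> list:
    """Return every JSON value found back to back in `text`."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode() rejects leading whitespace, so skip it manually.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # Returns the decoded value and the index just past it in `text`.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


# Invented sample: three JSON objects separated by whitespace.
sample = '{"a": 1} {"b": [2, 3]}\n{"c": null}'
print(decode_concatenated(sample))
```

Malformed input still raises a `ValueError` (specifically `json.JSONDecodeError`), which the caller can catch the same way the loop above does.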

If I somehow find the link again, I'll edit or comment on my answer.
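
If the response parses as a single JSON object (as `r.json()` in the question assumes), another option may be `pd.json_normalize` with `record_path`, which expands the nested 'periods' list into one row per period. The sketch below is an assumption-laden illustration: the `payload` dictionary is a hand-written stand-in that only mirrors the 'periods', 'name' and 'values' keys used in the code above, with values stored as strings; the real response may contain other fields.

```python
import pandas as pd

# Hand-written stand-in for the decoded JSON; only the 'periods', 'name'
# and 'values' keys are taken from the code above, the rest is assumed.
payload = {
    "periods": [
        {"name": "Jul.2018", "values": ["3.27659523809524"]},
        {"name": "Ago.2018", "values": ["3.28807142857143"]},
    ],
}

# One row per entry of the nested 'periods' list.
df = pd.json_normalize(payload, record_path="periods")

# Each 'values' cell is a list; take its first element and make it numeric.
df["value"] = pd.to_numeric(df["values"].apply(lambda v: v[0]))
df = df.rename(columns={"name": "month_year"})[["month_year", "value"]]
print(df)
```

With a live request, `payload` would be replaced by `r.json()` from the question.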

