How to read this link with Python? Pandas or BS4?

Question

I am trying to read the following link with Pandas:

http://api.eia.gov/series/?api_key=3d82a096b5e846caa05ddc8e747a7fd&series_id=PET.WGIRIUS2.W

I've tried using pd.read_json(), which raised an error: ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.

I also tried pd.read_csv, which returns a DataFrame without any rows, and all of the values end up in the list of columns.

This is the first part of my code:

import pandas as pd

eia_key='<private>'

def link_category(id_number):
    return 'http://api.eia.gov/category/?api_key='+eia_key+'&category_id='+id_number

def link_series(id_number):
    return 'http://api.eia.gov/series/?api_key='+eia_key+'&series_id='+id_number


'''U.S. Gross Inputs into Refineries, Weekly'''

page=link_series('PET.WGIRIUS2.W')

Then I try:

df=pd.read_csv(page)

and I get a mess with all of the table values as column names. But if I try

df=pd.read_json(page)

I get the error mentioned above.

Any suggestions on the best way to read these EIA datasets with Python? I am open to using another library, such as BS4, if that would be better.

Thank you in advance!

Answer 1

I believe you just want the data field out of the response.

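import json
import requests

# Fetch the URL built in the question (page) and parse the JSON body
d = json.loads(requests.get(page).text)
# The observations sit under the first series entry's 'data' key
df = pd.DataFrame(d['series'][0]['data'])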

I believe df will then contain the data you want.
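To make the shape of the result easier to see, here is a minimal sketch that does the same extraction on a hand-written payload instead of a live request. The sample dictionary, its values, the helper name series_to_frame, the column names date and value, and the YYYYMMDD date format are all assumptions made for illustration; only the d['series'][0]['data'] path comes from the answer above. The likely reason pd.read_json fails on the raw response is that its top level mixes a dict with a list, which pandas cannot turn into columns directly.

```python
import pandas as pd

# Hypothetical payload shaped like the response described above.
# The numbers are made up; only the nesting matters here.
sample = {
    "request": {"command": "series", "series_id": "PET.WGIRIUS2.W"},
    "series": [
        {
            "series_id": "PET.WGIRIUS2.W",
            "data": [["20170915", 100.0], ["20170908", 90.0]],
        }
    ],
}


def series_to_frame(payload):
    """Build a DataFrame from the first series' [date, value] pairs."""
    rows = payload["series"][0]["data"]
    df = pd.DataFrame(rows, columns=["date", "value"])
    # Assumes dates are strings in YYYYMMDD form, as in the sample above.
    df["date"] = pd.to_datetime(df["date"], format="%Y%m%d")
    # Oldest observation first, with a clean 0..n-1 index.
    return df.sort_values("date").reset_index(drop=True)


df = series_to_frame(sample)
print(df)
```

With a live response, the same helper could be called as series_to_frame(requests.get(page).json()), provided the response really has this structure.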


Answer 1: accepted, 2017-09-23 23:05:24
