Understanding how to extract data from an HTML file

Question

I am trying to access the "Yield Curve Data" available on this page. The page has a radio button which, upon clicking "Submit", results in a zip file, and I want to get the data from that file using the "Retrieve all data" option. My code is as follows. From the statement print result.read() I realize that result is actually an HTML document. My difficulty is in understanding how to extract the data from result, as I don't see any data in it. I am confused as to where to go from here.

import urllib, urllib2
import csv
from StringIO import StringIO
import pandas as pd
import os
from zipfile import ZipFile

my_url = 'http://www.bankofcanada.ca/rates/interest-rates/bond-yield-curves/'
data = urllib.urlencode({'lastchange': 'all'}) 
request = urllib2.Request(my_url, data)
result = urllib2.urlopen(request)

Thank you.

Answer 1

You're going to need to send a POST request to the following endpoint:

http://www.bankofcanada.ca/stats/results/csv

With the following form data:

lookupPage: lookup_yield_curve.php
startRange: 1986-01-01
searchRange: all
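
As a rough sketch of that request in Python 3 (the question's code uses Python 2's urllib2; `urllib.request` is its Python 3 counterpart), the form fields can be URL-encoded and passed as the request body, which makes urllib issue a POST. The names `build_csv_request` and `fetch` are made up for illustration, and whether the endpoint still accepts these fields is an assumption taken from this answer.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

CSV_ENDPOINT = 'http://www.bankofcanada.ca/stats/results/csv'

# Form fields exactly as listed in this answer.
FORM_DATA = {
    'lookupPage': 'lookup_yield_curve.php',
    'startRange': '1986-01-01',
    'searchRange': 'all',
}


def build_csv_request(url=CSV_ENDPOINT, form=FORM_DATA):
    """Return a POST request carrying the URL-encoded form (hypothetical helper)."""
    body = urlencode(form).encode('ascii')  # request bodies must be bytes in Python 3
    return Request(url, data=body)          # supplying data makes urllib use POST


def fetch(request, timeout=30):
    """Send the request and return the raw response body as bytes (hypothetical helper)."""
    with urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `fetch(build_csv_request())` performs the actual download; nothing is sent until then.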

This should give you the file.
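
The question expects a zip archive, while the endpoint's name suggests plain CSV, so it may be worth checking which one came back before parsing. The sketch below assumes the response body is already held in a bytes object. `rows_from_payload` is an illustrative name, UTF-8 is a guess at the encoding, and taking the first archive member as the data file is also an assumption.

```python
import csv
import io
import zipfile


def rows_from_payload(payload, encoding='utf-8'):
    """Return CSV rows from raw bytes that are either a zip archive or plain CSV."""
    buffer = io.BytesIO(payload)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            # Assumption: the first member of the archive is the data file.
            payload = archive.read(archive.namelist()[0])
    text = payload.decode(encoding)
    return list(csv.reader(io.StringIO(text)))
```

If the rows that come back look like HTML markup rather than data, the server most likely returned a page instead of the file, which is the situation described in the question.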

You may also need to spoof your User-Agent header.
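
If the server turns away urllib's default `Python-urllib` client string, a browser-like `User-Agent` header can be attached to the request. This is a minimal Python 3 sketch: the header value is an arbitrary placeholder rather than something the site is known to require, and `with_user_agent` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request


def with_user_agent(url, form,
                    user_agent='Mozilla/5.0 (compatible; example-client)'):
    """Build a POST request with an overridden User-Agent (hypothetical helper)."""
    body = urlencode(form).encode('ascii')
    return Request(url, data=body, headers={'User-Agent': user_agent})


# Same endpoint and form fields as given in this answer.
request = with_user_agent(
    'http://www.bankofcanada.ca/stats/results/csv',
    {'lookupPage': 'lookup_yield_curve.php',
     'startRange': '1986-01-01',
     'searchRange': 'all'},
)
```

The resulting `request` object can be passed to `urllib.request.urlopen` in the usual way.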

Answer 1 was posted 2015-06-17 17:07:30.
