This code will get you logged in as TrickyBen, and make a request to the website API...
import requests from lxml import html from requests import Session import pandas as pd import shutil
raceSession = Session()
LoginDetails = {'login': 'TrickyBen', 'password': 'TrickyBen123'}
LoginUrl = 'https://www.horseracebase.com/horse-racing-results.php?year=2005&month=3&day=15/horsebase1.php'
LoginPost = raceSession.post(LoginUrl, data=LoginDetails)
RaceUrl = 'https://www.horseracebase.com/excelresults.php'
RaceDataDetails = {"user": "41495", "racedate": "2005-3-15", "downloadbutton": "Excel"}
PostHeaders = {"Content-Type": "application/x-www-form-urlencoded"}
Response = raceSession.post(RaceUrl, data=RaceDataDetails, headers=PostHeaders)
Table = pd.read_table(Response.text)
Table.to_csv('blahblah.csv')
If you inspect element, you'll notice that the relevant element looks like this...
<form action="excelresults.php" method="post">
<input type="hidden" name="user" value="41495">
<input type="hidden" name="racedate" value="2005-3-15">
<input type="submit" class="downloadbutton" value="Excel">
</form>
I get this error message...
Traceback (most recent call last):
File "/Users/Alex/Desktop/DateTest/hrpull.py", line 20, in <module>
Table = pd.read_table(Response.text)
File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 562, in parser_f
return _read(filepath_or_buffer, kwds)
File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 315, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 645, in __init__
self._make_engine(self.engine)
File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 799, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/Library/Python/2.7/site-packages/pandas/io/parsers.py", line 1213, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "pandas/parser.pyx", line 358, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3427)
File "pandas/parser.pyx", line 628, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:6861)
IOError: File race_date race_time track race_name race_restrictions_age race_class major race_distance prize_money going_description number_of_runners place distbt horse_name stall trainer horse_age jockey_name jockeys_claim pounds odds fav official_rating comptime TotalDstBt MedianOR Dist_Furlongs placing_numerical RCode BFSP BFSP_Place PlcsPaid BFPlcsPaid Yards RailMove RaceType
"2005-03-15" "14:00:00" "Cheltenham" "Letheby & Christopher Supreme Novices Hurdle " "4yo+" "Class 1" "Grade 1" "2m˝f " "58000" "Good" "20" "1st" "Arcalis" "0" "Johnson, J Howard" "5" "Lee, G" "0" "161" "21" "136" "3 mins 53.00s" "121.5" "16.5" "1" "National Hunt" "0" "0" "3" "0" "0" "0" "Novices Hurdle"
"2005-03-15" "14:00:00" "Cheltenham" "Letheby & Christopher Supreme Novices Hurdle " "4yo+" "Class 1" "Grade 1" "2m˝f " "58000" "Good" "20" "2nd" "6" "Wild Passion (GER)" "0" "Meade, Noel" "5" "Carberry, P" "0" "161" "11" "0" "3 mins 53.00s" "6" "121.5" "16.5" "2" "National Hunt" "0" "0" "3" "0" "0" "0" "Novices Hurdle"
I'm thinking that you can see the data that you want to download in another web page, for example, by clicking on "My Systems (v4)". If you can do that, then you can use urllib.request.urlretrieve to download that page. And then you can use html.parser.HTMLParser to parse the data and do with as you wish.
If you would look at the api being called on the form action, you'll see that you've to make a post request to this url :
https://www.horseracebase.com/excelresults.php
with following parameters:
data = {
"user": "41495", # looks like this varies with login, so update in case you change your login id
"racedate": "2005-3-15",
"downloadbutton": "Excel"
}
you can do something like this:
response = raceSession.post(reqUrl, json=data)
If this doesn't work, try adding headers to the request like: headers=postHeaders
. For ex. you should set the content type header in this case, as you're sending form encoded data, so:
headers = {"Content-Type": "application/x-www-form-urlencoded"}
Read this for more info on how to save the excel to a file
Here's the response for this request in Postman, so looks like you won't need any additional headers except the content-type
:
EDIT
This is what you need to do:
raceSession = Session()
RaceUrl = 'https://www.horseracebase.com/excelresults.php'
RaceDataDetails = {"user": "41495", "racedate": "2005-3-15", "downloadbutton": "Excel"}
PostHeaders = {"Content-Type": "application/x-www-form-urlencoded"}
Response = raceSession.post(RaceUrl, data=RaceDataDetails, headers=PostHeaders)
# from StringIO import StringIO #for python 2.x
#import StringIO #for python 3.x
Table = pd.read_table(StringIO(Response.text))
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.