
Downloading all available CSV and KML files from a specific website

I am trying to use Python to automate downloading all the available CSV and KML files from data.gov.sg. However, I get an "HTTP Error 403: Forbidden" error message. I used to get a robots.txt error, which has been solved. Is there anything wrong with the code below?

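import mechanize
from time import sleep

br = mechanize.Browser()

# Fetch the landing page and keep a copy of its HTML for inspection.
br.open('https://data.gov.sg/')
with open("source.html", "wb") as f:
    f.write(br.response().read())

# Collect every link whose description mentions one of the wanted file types.
filetypes = [".csv", ".kml"]
myfiles = []
for l in br.links():
    for t in filetypes:
        if t in str(l):
            myfiles.append(l)
            break  # avoid adding the same link twice


def downloadlink(l):
    # follow_link() fetches the link; click_link() only builds a request object.
    br.follow_link(link=l)
    # Write in binary mode so the downloaded bytes are saved unchanged.
    with open(l.text, "wb") as f:
        f.write(br.response().read())
    print("%s has been downloaded" % l.text)
    # br.back()


for l in myfiles:
    sleep(1)  # pause between requests
    downloadlink(l)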

An HTTP 403 error (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.4) means that you have been forbidden access: either you need to be authorised to access the resource, or the server administrator has blocked you and the server is sending a 403 response as the notification.

So I can see nothing in your code that would cause this issue (although you appear to have lost some indentation).
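If you want to find out which kind of block you are hitting, it can help to catch the 403 explicitly and to send an explicit User-Agent header. Some servers reject requests from default library user agents, though whether that applies to data.gov.sg is an assumption, not something confirmed here. The following is a minimal sketch using only the Python 3 standard library rather than mechanize; the names `USER_AGENT`, `build_request` and `fetch` and the user-agent string are hypothetical and chosen only for illustration.

```python
import urllib.error
import urllib.request

# Hypothetical user-agent string, used only for illustration.
USER_AGENT = "Mozilla/5.0 (compatible; example-downloader/0.1)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Return the response body as bytes, or None if the server answers 403."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=30) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 403:
            # The server is refusing this request; report it and move on.
            print("403 Forbidden for", url)
            return None
        raise  # any other HTTP error is left for the caller to handle
```

Calling `fetch('https://data.gov.sg/')` would then return either the page bytes or `None`. If the 403 persists even with a browser-like user agent, the block is probably deliberate, and the site's terms of use or an official API are the better route.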
