Downloading file from “javascript:__doPostBack” link using Python

I have an existing Python script, written with urllib2, that downloads a file from an http:// link:

import urllib2
import os.path
import os
from os import chdir, getcwd, listdir, path

print "downloading with urllib2"

# Fetch the document from its direct download URL
f = urllib2.urlopen('http://www.dcregs.dc.gov/Notice/DownLoad.aspx?VersionID=4613531')
data = f.read()

# Write the raw bytes to a local file
with open("11-B300.doc", "wb") as code:
    code.write(data)

print "All done downloads!"

The source web page has been reformatted to use a "javascript:__doPostBack" address:

javascript:__doPostBack('ctl00$MainContent$rpt_ruleList$ctl02$Label1','')

I presume there is a package, similar to urllib2, that will let me either download the same file through the "javascript:__doPostBack" address or find the underlying http URL where the file is located, so that I can download it from there.

The existing script was working well for my purposes, so I would like to limit the additional coding, if possible.

Is there an alternative to urllib2 that will allow me to download the information in a similar manner?

Or am I going to have to get more sophisticated in my solution (e.g., using Selenium to scrape the information)? (Do I want to get more sophisticated so that I don't have to manage updates to individual URLs?)

Thanks for your help in advance.

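This happens because the site you're using is built on .NET WebForms, which manages the state of the page and the interaction with it through hidden form variables. A "javascript:__doPostBack" link is therefore not a URL you can fetch directly; it submits the page's form back to the server.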
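To make the hidden form variables more concrete, here is a minimal sketch (Python 3, standard library only) that collects the hidden inputs from a page and fills in `__EVENTTARGET` and `__EVENTARGUMENT`, which are the two fields that `__doPostBack` sets before submitting the form. The sample HTML, the `Example.aspx` action, the `__VIEWSTATE` value, and the names `HiddenFieldParser` and `build_postback_payload` are all made up for illustration. Whether posting such a payload back to this particular site would return the file has not been tested.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback_payload(page_html, event_target, event_argument=""):
    """Return the form fields a __doPostBack(target, argument) call would submit."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    payload = dict(parser.fields)
    # __doPostBack writes its two arguments into these hidden fields.
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload


# Hypothetical markup standing in for the real page (an assumption).
SAMPLE_HTML = """
<form method="post" action="Example.aspx" id="aspnetForm">
  <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
  <input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTIz" />
</form>
"""

payload = build_postback_payload(
    SAMPLE_HTML, "ctl00$MainContent$rpt_ruleList$ctl02$Label1"
)
print(urlencode(payload))
```

The hidden state values change from one page load to the next, so they would have to be read from a fresh copy of the page each time rather than hard-coded.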

So, in short, you'll need to click the link with something like Selenium, as you suggest.
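A rough sketch of the Selenium route (Python 3) might look like the following. It assumes the `selenium` package and a Firefox driver are installed, and that you pass in the address of the page that lists the documents, which the question does not give. The helper name `click_postback_link` is made up for this example. Rather than locating the link element, it asks the browser to call the page's own `__doPostBack` function with the target taken from the link.

```python
def click_postback_link(driver, event_target, event_argument=""):
    """Run the page's own __doPostBack function inside the browser."""
    # Extra positional arguments to execute_script are exposed to the
    # script as arguments[0], arguments[1], and so on.
    driver.execute_script(
        "__doPostBack(arguments[0], arguments[1]);",
        event_target,
        event_argument,
    )


def main(list_page_url):
    """Open the listing page and trigger the postback for one document."""
    # Third-party dependency, imported here so the helper above can be
    # used and tested without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(list_page_url)
        click_postback_link(
            driver, "ctl00$MainContent$rpt_ruleList$ctl02$Label1"
        )
    finally:
        driver.quit()
```

Call `main()` with the listing page's URL to try it. Where the downloaded file ends up depends on how the browser profile is configured, so that part would still need to be set up separately.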

