
How to download text file from website using Python?

I need to write a function that downloads and stores today's list of pre-release domains (a .txt file) from http://www.namejet.com/pages/downloads.aspx. For example, since today is the 8th of October, the function should fetch the file "Monday, October 08, 2012". I tried using requests, but it didn't work. The difficulty is that the file is not stored at a fixed URL but is hidden behind some JavaScript.

This one's a little tricky as you're dealing with ASP.NET's postback system. If this is for anything other than a personal script, I'd be wary as you're effectively not only using another site's data, but reverse engineering their software as well (however, IANAL and have no idea about legalities around these issues in web systems).

What you're going to want to do is inspect the POST data your browser sends when you click the download link (using Firebug, Chrome developer tools, etc.) and look for the __EVENTTARGET and __VIEWSTATE hidden fields of the form. You'll have to decode the __VIEWSTATE to make it readable (check out http://ignatu.co.uk/ViewStateDecoder.aspx ). From there, I think you should be able to figure out how to get the data you're looking for.
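If you'd rather pull those hidden fields out of the page with a script than copy them by hand, something like the following might do as a starting point. It is only a sketch using the standard library's html.parser; the sample markup is made up for illustration and is not copied from the real page, and `HiddenFieldParser` / `hidden_fields` are names I invented.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Made-up sample markup (an assumption, not the real page source).
sample = '''
<form method="post" action="downloads.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dGVzdA==" />
  <input type="hidden" name="__EVENTTARGET" value="" />
  <input type="text" name="q" value="ignored" />
</form>
'''
print(hidden_fields(sample))
```

On the real page you would feed it the HTML returned by a plain GET of the downloads URL.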

From Python, it's as easy as:

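from urllib.request import urlopen
from urllib.parse import urlencode

# 'url', 'foo' and 'bar' are placeholders: use the page URL and the
# values of the hidden form fields you found in the POST data.
form = urlencode({
    '__VIEWSTATE': 'foo',
    '__EVENTTARGET': 'bar',
}).encode('ascii')  # on Python 3 the POST body must be bytes

# Passing a body as the second argument makes urlopen send a POST.
data = urlopen('url', form).read()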
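In practice the placeholder values have to come from the page itself, and a postback usually needs every hidden field sent back, not just two of them. Here is a rough sketch (Python 3) of assembling the request body. `build_postback_body` is a hypothetical helper and `ctl00$link` is a made-up control name; the real __EVENTTARGET value is whatever you see in the captured POST data.

```python
from urllib.parse import urlencode, parse_qs


def build_postback_body(hidden, event_target, event_argument=""):
    """Build a urlencoded POST body that mimics an ASP.NET postback.

    hidden         -- dict of hidden fields scraped from the page
    event_target   -- name of the control whose click is being simulated
    event_argument -- optional argument, usually empty
    """
    fields = dict(hidden)  # copy so the caller's dict is not modified
    fields["__EVENTTARGET"] = event_target
    fields["__EVENTARGUMENT"] = event_argument
    return urlencode(fields).encode("ascii")


# Made-up values for illustration only.
body = build_postback_body(
    {"__VIEWSTATE": "dGVzdA==", "__EVENTTARGET": ""},
    "ctl00$link",
)
print(parse_qs(body.decode("ascii"), keep_blank_values=True))
```

The resulting bytes can be passed as the second argument to urlopen, which turns the request into a POST.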

Actually, you get the text file in response to a POST request with several base64-encoded request parameters. Feel free to play with it.
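Once the POST gives you the file contents, storing them under today's name is the easy part. A small sketch, assuming the response body is available as bytes and that the label format is the one quoted in the question; `file_label` and `store_list` are names I made up, and the sample bytes are not real data.

```python
import datetime
import tempfile
from pathlib import Path


def file_label(day):
    """Format a date like the link text, e.g. 'Monday, October 08, 2012'.

    %A and %B depend on the locale; this assumes the default C/English one.
    """
    return day.strftime("%A, %B %d, %Y")


def store_list(data, day, directory="."):
    """Write the downloaded bytes to '<label>.txt' and return the path."""
    path = Path(directory) / (file_label(day) + ".txt")
    path.write_bytes(data)
    return path


# Demo with made-up content, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    saved = store_list(b"example.com\n", datetime.date(2012, 10, 8), tmp)
    print(saved.name)
```

For a daily job you would pass datetime.date.today() and a real target directory instead.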

Use Firebug or any other debugging tool to see the POST content and parameters.
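After copying the POST body out of the debug tool, you can split it into parameters and try base64-decoding each one to see what is inside. This is just a sketch: the `captured` string is invented for illustration, and a real __VIEWSTATE decodes to a serialized binary blob rather than readable text.

```python
import base64
from urllib.parse import parse_qs

# Made-up example of a body copied from the browser's network panel.
captured = "__EVENTTARGET=ctl00%24link&__VIEWSTATE=aGVsbG8gd29ybGQ%3D"


def decode_fields(body):
    """Split a urlencoded body and base64-decode the values where possible.

    Values that are not valid base64 are kept as plain strings. Note that a
    plain string which happens to look like base64 would be decoded too.
    """
    decoded = {}
    for name, values in parse_qs(body, keep_blank_values=True).items():
        raw = values[0]
        try:
            decoded[name] = base64.b64decode(raw, validate=True)
        except ValueError:  # binascii.Error is a subclass of ValueError
            decoded[name] = raw
    return decoded


for name, value in decode_fields(captured).items():
    print(name, value)
```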
