
How to download a text file from a website using Python?

I need to write a function that downloads and stores today's list of pre-release domains (a .txt file) from http://www.namejet.com/pages/downloads.aspx. Since today is the 8th of October, the file I want is "Monday, October 08, 2012". I tried using requests, but it didn't work. The trouble is that the file is not stored at a fixed URL but is hidden behind some JavaScript.

This one's a little tricky as you're dealing with ASP.NET's postback system. If this is for anything other than a personal script, I'd be wary, as you're effectively not only using another site's data but reverse engineering their software as well (however, IANAL and have no idea about the legalities around these issues in web systems).

What you're going to want to do is check the POST data (using Firebug, Chrome developer tools, etc.) and look for the __EVENTTARGET and __VIEWSTATE fields of the form. You'll have to decode the __VIEWSTATE to make it readable (check out http://ignatu.co.uk/ViewStateDecoder.aspx). From there, I think you should be able to figure out how to get the data you're looking for.
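One way to collect those hidden fields without copying them by hand is to parse the page's HTML. The sketch below uses only the standard library; the names `HiddenFieldParser` and `extract_hidden_fields` and the sample markup are illustrative assumptions, not taken from the actual page.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get('type') or '').lower() == 'hidden'
        if is_hidden and attributes.get('name'):
            self.fields[attributes['name']] = attributes.get('value') or ''


def extract_hidden_fields(html):
    """Return a dict of every hidden form field found in the HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Made-up markup in the general shape of an ASP.NET form (an assumption).
sample_html = '''
<form method="post" action="downloads.aspx">
  <input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dGVzdA==" />
  <input type="text" name="search" value="ignored" />
</form>
'''

fields = extract_hidden_fields(sample_html)
print(fields)
```

In a real run the HTML would come from a GET request to the downloads page, and the extracted values would be sent back in the POST.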

From Python, it's as easy as: 在Python中,它非常简单:

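from urllib.parse import urlencode
from urllib.request import urlopen

# 'url', 'foo' and 'bar' are placeholders: substitute the page URL and the
# field values captured from the real POST request.
payload = urlencode({
    '__VIEWSTATE': 'foo',
    '__EVENTTARGET': 'bar',
}).encode('ascii')

# Passing a data argument makes urlopen send a POST instead of a GET.
data = urlopen('url', payload).read()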
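Since the question also asks to store the file, the request above could be wrapped roughly as follows. This is a sketch under assumptions: `build_postback_body` and `download_to` are hypothetical helper names, and the field values are the same placeholders as above, which have to be replaced with values read from the real request.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_postback_body(hidden_fields, event_target):
    """Return the URL-encoded POST body for a postback.

    hidden_fields: dict of hidden form values copied from the page.
    event_target: value for __EVENTTARGET (placeholder in this sketch).
    """
    form = dict(hidden_fields)
    form['__EVENTTARGET'] = event_target
    return urlencode(form).encode('ascii')


def download_to(page_url, body, destination):
    """POST the body to page_url and write the response bytes to a file."""
    request = Request(page_url, data=body)
    with urlopen(request) as response, open(destination, 'wb') as out:
        out.write(response.read())


# Placeholder values for illustration only; no network call is made here.
body = build_postback_body({'__VIEWSTATE': 'foo'}, 'bar')
print(body)
```

Calling `download_to(page_url, body, 'domains.txt')` with the real page URL and captured values would then save the response; the file name is arbitrary.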

Actually, you get the text file in response to a POST request with several base64-encoded request parameters. Feel free to play with it.
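For a first look at what such a parameter contains, the standard `base64` module is usually enough. The sample value below is made up, and `decode_parameter` is a hypothetical helper; a real value may decode to binary data in which only fragments are readable.

```python
import base64


def decode_parameter(value):
    """Decode a base64-encoded form value and return the raw bytes."""
    return base64.b64decode(value)


# Made-up sample: encode a known payload, then decode it again.
sample = base64.b64encode(b'example payload').decode('ascii')
decoded = decode_parameter(sample)

# latin-1 maps every byte to a character, so this never raises on binary data.
print(decoded.decode('latin-1'))
```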

Use Firebug or any other debugging tool to see the POST content and parameters.
