
How to download a text file or some objects from webpage using Python?

I am writing a function that downloads and stores today's list of pre-release domains (a .txt file) from http://www.namejet.com/pages/downloads.aspx. I am trying to do this using json.

import json
import requests

def hello():
    r = requests.get('http://www.namejet.com/pages/downloads.aspx') 
    #Replace with your website URL

    with open("a.txt", "w") as f: 
    #Replace with your file name
        for item in r.json or []:
            try:
                f.write(item['name']['name'] + "\n") 
            except KeyError: 
                pass  

hello()

I need to download the file that contains the pre-release domains using Python. How can I do that? Is the code above the right way to do it?

I don't think mechanize is much use for JavaScript; use selenium instead. Here's an example:

In [1]: from selenium import webdriver
In [2]: from selenium.webdriver.common.by import By
In [3]: browser = webdriver.Chrome()  # Select the browser that you want to automate
In [4]: browser.get('http://www.namejet.com/pages/downloads.aspx')
In [5]: element = browser.find_element(  # find_element_by_xpath was removed in Selenium 4
            By.XPATH, '//a[@id="ctl00_ContentPlaceHolder1_hlPreRelease1"]')

In [6]: element.click()

Now you can find prerelease_10-08-2012.txt in your download folder and open it in the usual way.
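Opening the file "in the usual way" could look like the sketch below. It assumes the list holds one domain per line and that the browser saved it to ~/Downloads; neither is confirmed above, and read_domains is just an illustrative helper name.

```python
from pathlib import Path


def read_domains(path):
    """Return the non-empty, stripped lines of a downloaded domain list."""
    # Assumption: one domain per line; undecodable bytes are replaced, not fatal.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Assumed location: the browser's default download folder.
    downloaded = Path.home() / "Downloads" / "prerelease_10-08-2012.txt"
    if downloaded.exists():
        for domain in read_domains(downloaded):
            print(domain)
```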

I see a few problems with your approach:

  1. The page doesn't return any JSON, so even if you access the page successfully, r.json will be empty:

     >>> import requests
     >>> r = requests.get('http://www.namejet.com/pages/downloads.aspx')
     >>> r.json

  2. The file that you are after is hidden behind a postback link, which you cannot "execute" using requests because it does not understand JavaScript.
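To make the first point concrete, here is a minimal sketch of checking whether a response is JSON before iterating over it. The helper name parse_json_or_none is made up for illustration. Note that in current versions of requests, json is a method (r.json()) that raises an exception on a non-JSON body instead of returning an empty value.

```python
import json


def parse_json_or_none(content_type, body):
    """Return the decoded JSON, or None when the response is not JSON."""
    # An HTML page such as downloads.aspx is typically served as text/html.
    if "json" not in (content_type or "").lower():
        return None
    try:
        return json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
```

With requests this would be called as parse_json_or_none(r.headers.get("Content-Type"), r.text); for an HTML page the result is None, so the loop in the question would write nothing.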

In light of the above, the better approach is to use mechanize or alternatives to emulate a browser. You could also ask the company to provide you with a direct link.
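For context on what emulating the browser involves: an ASP.NET postback link fills in the hidden fields __EVENTTARGET and __EVENTARGUMENT and then submits the page's form together with its other hidden fields (such as __VIEWSTATE). The sketch below only assembles that form data from the page HTML using the standard library. The event target value is an assumption you would have to read from the link's href, and whether this site accepts such a request is untested.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback_payload(page_html, event_target):
    """Return the form data a browser would submit for a postback link."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    payload = dict(parser.fields)
    # event_target is hypothetical here; take it from the link's __doPostBack call.
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = ""
    return payload
```

The resulting dictionary could then be sent with requests.post(url, data=payload) in the same session that fetched the page. mechanize or selenium do this work for you, which is why they are the simpler choice.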
