
How to download an HTML webpage using Selenium with Python?

I want to download a webpage using Selenium with Python. I am using the following code:

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_argument('--save-page-as-mhtml')
# Pass the options to the driver; otherwise the flag above is never applied.
driver = webdriver.Chrome(options=chromeOptions)

driver.get("http://www.yahoo.com")

saveas = ActionChains(driver).key_down(Keys.CONTROL)\
         .key_down('s').key_up(Keys.CONTROL).key_up('s')
saveas.perform()
print("done")

However, the above code isn't working. I am using Windows 7. Is there any way I can bring up the "Save As" dialog box?

Thanks, Karan

You can use the code below to download the page HTML:

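from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.yahoo.com")
# Write as UTF-8 explicitly; the Windows default encoding can fail on non-ASCII pages.
with open("/path/to/page_source.html", "w", encoding="utf-8") as f:
    f.write(driver.page_source)
driver.quit()  # close the browser once the file is written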

Just replace "/path/to/page_source.html" with the desired file path and file name. Note that page_source contains only the HTML of the page, not the CSS, JS, or images it links to.
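The same idea can be wrapped in a small helper so the encoding and the output folder are handled in one place. This is only a sketch: `save_page_source` is a hypothetical name, not part of Selenium, and it assumes `driver` is any object exposing a `page_source` string, as Selenium drivers do.

```python
from pathlib import Path


def save_page_source(driver, path):
    """Write driver.page_source to `path` as UTF-8 and return the Path written.

    `save_page_source` is an illustrative helper, not a Selenium API.
    """
    target = Path(path)
    # Create missing parent folders so the write does not fail on a new path.
    target.parent.mkdir(parents=True, exist_ok=True)
    # page_source is a str; encode explicitly instead of relying on the
    # platform default encoding.
    target.write_text(driver.page_source, encoding="utf-8")
    return target
```

With a live browser you would call it after `driver.get(...)`, for example `save_page_source(driver, "out/page_source.html")`.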

Update

If you need to get the complete page (including CSS, JS, ...), you can use the following solution, which drives the browser's "Save As" dialog with AutoHotkey:

pip install pyahk # from command line

Python code:

import ahk
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

# Path to the Firefox executable; adjust for your installation.
firefox = FirefoxBinary("C:\\Program Files (x86)\\Mozilla Firefox\\firefox.exe")
driver = webdriver.Firefox(firefox_binary=firefox)
driver.get("http://www.yahoo.com")

# Drive the native "Save As" dialog with AutoHotkey.
ahk.start()
ahk.ready()
ahk.execute("Send,^s")                       # press Ctrl+S
ahk.execute("WinWaitActive, Save As,,2")     # wait up to 2 seconds for the dialog
ahk.execute("WinActivate, Save As")
ahk.execute("Send, C:\\path\\to\\file.htm")  # type the target path
ahk.execute("Send, {Enter}")                 # confirm the save
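If you are on Chrome rather than Firefox, a different technique may avoid the "Save As" dialog entirely. This is not the AutoHotkey approach above: it asks Chrome for an MHTML snapshot through the DevTools Protocol command `Page.captureSnapshot`. The sketch below assumes a Chromium-based driver that provides `execute_cdp_cmd`; `save_as_mhtml` is a hypothetical helper name.

```python
from pathlib import Path


def save_as_mhtml(driver, path):
    """Save the current page and its resources as a single MHTML file.

    `save_as_mhtml` is an illustrative helper; it assumes `driver` is a
    Chromium-based Selenium driver that provides execute_cdp_cmd.
    """
    # Page.captureSnapshot is a Chrome DevTools Protocol command; the result
    # is a dict whose "data" entry holds the MHTML document as text.
    snapshot = driver.execute_cdp_cmd("Page.captureSnapshot", {"format": "mhtml"})
    target = Path(path)
    # newline="" writes the payload's own line endings unchanged.
    with target.open("w", encoding="utf-8", newline="") as f:
        f.write(snapshot["data"])
    return target
```

After `driver.get(...)`, calling `save_as_mhtml(driver, "page.mhtml")` should produce a file that Chrome can open offline. Firefox drivers do not offer this command, so the AutoHotkey route remains the option there.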

