
Download a file using Python from a webpage without opening the webpage

I was looking for a way to write a script that downloads a file from a specific website without opening the website itself. I want everything to happen in the background.

The website is Morningstar, and a specific link for the example is this one: https://financials.morningstar.com/ratios/r.html?t=MSFT. On this page there is a "button" (it is not really declared as a button but as a hyperlink, the <a> tag in HTML).

I added a photo at the bottom so you can see for yourself exactly how they wrote the code.

Anyway, I saw that when I click the button, the href attribute actually calls a JavaScript function, which then creates the link from which the file is downloaded.

I am looking for a way to write a script that takes the link I want (for example, the link above) and downloads this specific CSV file from that page into a folder of my choice.

I was looking at some Selenium tutorials, but I couldn't find much help for my specific problem.

[Image: screenshot of the page's HTML for the link; no description provided]

Here's an example I use:

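import requests

# Example URL; replace it with the direct link to the file you want
url = 'http://via.placeholder.com/350x150'

# Fetch the file in the background, following any redirects
dashboardFile = requests.get(url, allow_redirects=True)

# Write the raw response bytes to disk; the with block closes the file
with open('d:/dev/projects/new-wave/dashboard.pdf', 'wb') as f:
    f.write(dashboardFile.content)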

Oh, and depending on the size of the file, you may want to download it in chunks. A quick search for "python download large file in chunks" will help.
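For what it's worth, a chunked version could look roughly like the sketch below. It is only a sketch: the function name download_file, its parameters, the 30-second timeout, and the 8192-byte chunk size are my own choices for illustration, not anything prescribed by requests. It uses stream=True with iter_content so the whole file is never held in memory, and it creates the target folder if it does not exist yet.

```python
import os

import requests


def download_file(url, folder, filename, chunk_size=8192):
    """Stream the response body at `url` into folder/filename (hypothetical helper)."""
    os.makedirs(folder, exist_ok=True)  # create the target folder if needed
    path = os.path.join(folder, filename)

    # stream=True defers fetching the body until we iterate over it
    with requests.get(url, stream=True, allow_redirects=True, timeout=30) as response:
        # Raise on 4xx/5xx so an error page is not saved as the file
        response.raise_for_status()
        with open(path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
    return path


# Example call, reusing the example URL and folder from the snippet above:
# download_file('http://via.placeholder.com/350x150',
#               'd:/dev/projects/new-wave', 'dashboard.pdf')
```

Note that this still needs the direct URL of the file. If the page builds that URL in JavaScript, you would first have to find out what URL the browser actually requests when you click the link.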
