Downloading files from URLs and saving them to a folder in Python

Question

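I have many URLs pointing to .docx and .pdf files. I want to run a Python script that downloads them from those URLs and saves them in a folder. This is what I have for a single file; I will put it in a for loop later: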

import requests

response = requests.get('http://wbesite.com/Motivation-Letter.docx')
with open("my_file.docx", 'wb') as f:
    f.write(response.content)

But the my_file.docx it saves is only 266 bytes and is corrupted, even though the URL is fine.
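A 266-byte file is far smaller than a typical .docx, so one possible explanation (an assumption, the question does not confirm it) is that the server returned something other than the document, such as an error page. One way to check this before writing to disk is to look at the first bytes of the data: .docx files are ZIP archives and begin with `PK`, and PDF files begin with `%PDF`. The helper name `has_expected_signature` and the `SIGNATURES` table below are hypothetical, introduced only for this sketch.

```python
# Hypothetical helper (not from the original post): compare the first bytes
# of downloaded data with the signature expected for the file extension.
SIGNATURES = {
    ".docx": b"PK",    # .docx files are ZIP archives, which start with "PK"
    ".pdf": b"%PDF",   # PDF files start with "%PDF"
}


def has_expected_signature(data: bytes, filename: str) -> bool:
    """Return True if data starts with the signature for filename's extension."""
    for ext, magic in SIGNATURES.items():
        if filename.lower().endswith(ext):
            return data.startswith(magic)
    return True  # unknown extension: nothing to check


print(has_expected_signature(b"<html>Not Found</html>", "my_file.docx"))  # False
print(has_expected_signature(b"%PDF-1.7 ...", "report.pdf"))              # True
```

In the snippet from the question, this could be called as `has_expected_signature(response.content, "my_file.docx")` before opening the output file; inspecting `response.status_code` at the same point would also show whether the request itself succeeded.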

Update:

I added the code below and it works, but I want to save the file in a new folder.

import os
import shutil
import requests

def download_file(url, folder_name):
    local_filename = url.split('/')[-1]
    path = os.path.join("/{}/{}".format(folder_name, local_filename))
    with requests.get(url, stream=True) as r:
        with open(path, 'wb') as f:
            shutil.copyfileobj(r.raw, f)

    return local_filename

Answer 1

Try using the stream option:

import os
import requests


def download(url: str, dest_folder: str):
    if not os.path.exists(dest_folder):
        os.makedirs(dest_folder)  # create folder if it does not exist

    filename = url.split('/')[-1].replace(" ", "_")  # be careful with file names
    file_path = os.path.join(dest_folder, filename)

    r = requests.get(url, stream=True)
    if r.ok:
        print("saving to", os.path.abspath(file_path))
        with open(file_path, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024 * 8):
                if chunk:
                    f.write(chunk)
                    f.flush()
                    os.fsync(f.fileno())
    else:  # HTTP status code 4XX/5XX
        print("Download failed: status code {}\n{}".format(r.status_code, r.text))


download("http://website.com/Motivation-Letter.docx", dest_folder="mydir")

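Note that mydir in the example above is the name of a folder in the current working directory. If mydir does not exist, the script creates it in the current working directory and saves the file there. Your user must have permission to create directories and files in the current working directory.

You can also pass an absolute path as dest_folder, but check the permissions first.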
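The folder handling described above can be sketched as a small helper that creates the folder when it is missing and checks write permission up front. The function name `ensure_writable_folder` is hypothetical and not part of the answer; `os.makedirs(..., exist_ok=True)` and `os.access` are standard-library calls.

```python
import os
import tempfile


def ensure_writable_folder(dest_folder: str) -> str:
    """Create dest_folder if needed and return its absolute path.

    Raises PermissionError if the current user cannot write to it.
    """
    os.makedirs(dest_folder, exist_ok=True)  # no error if it already exists
    abs_path = os.path.abspath(dest_folder)
    if not os.access(abs_path, os.W_OK):
        raise PermissionError("cannot write to {}".format(abs_path))
    return abs_path


# Demonstration inside a throwaway temporary directory
with tempfile.TemporaryDirectory() as tmp:
    folder = ensure_writable_folder(os.path.join(tmp, "mydir"))
    print(os.path.isdir(folder))
```

A relative name such as "mydir" is resolved against the current working directory, while an absolute path is used as given, which matches the behaviour described for dest_folder.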

PS: avoid asking multiple questions in one post.

Answer 2

Try:

import urllib.request

# url is the address of the file; filename is the local path to save it to
urllib.request.urlretrieve(url, filename)
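Since the question mentions a for loop and a target folder, here is a minimal sketch of how `urlretrieve` might be combined with both. The names `filename_from_url` and `download_all` are hypothetical, and the sketch assumes every URL ends in a file name. The demonstration uses a local `file://` URL so it runs without network access; with real `http://` URLs the loop is the same.

```python
import os
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import unquote, urlsplit


def filename_from_url(url: str) -> str:
    """Take the last path segment of the URL, ignoring any query string."""
    path = unquote(urlsplit(url).path)
    return os.path.basename(path).replace(" ", "_")


def download_all(urls, dest_folder: str):
    """Download every URL into dest_folder and return the saved paths."""
    os.makedirs(dest_folder, exist_ok=True)
    saved = []
    for url in urls:
        target = os.path.join(dest_folder, filename_from_url(url))
        urllib.request.urlretrieve(url, target)
        saved.append(target)
    return saved


# Offline demonstration: a local file addressed through a file:// URL
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "Motivation Letter.docx"
    source.write_bytes(b"PK demo")
    saved = download_all([source.as_uri()], os.path.join(tmp, "mydir"))
    print([os.path.basename(p) for p in saved])
```

Parsing the URL with `urlsplit` rather than `url.split('/')[-1]` keeps query strings such as `?download=1` out of the saved file name.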

2 solutions

Solution 1
23 2019-07-09 11:03:12

Solution 2
5 2019-07-09 10:58:02
