
How to Download Files using Python?

Hi, everyone. I am new to Python and am using Python 2.5 on CentOS.

I need to download files the way wget does.

I have done some searching and found a few solutions. An obvious way is this:

import urllib2
mp3file = urllib2.urlopen("http://www.example.com/songs/mp3.mp3")
output = open('test.mp3','wb')
output.write(mp3file.read())
output.close()

This works fine. But I want to know: if the mp3 file is very large, like 1 GB, 2 GB or even bigger, will this code snippet still work? Are there better ways to download large files in Python, maybe with a progress bar like wget has?

Thanks a lot!

There's an easier way:

import urllib
urllib.urlretrieve("http://www.example.com/songs/mp3.mp3", "/home/download/mp3.mp3")

For really big files, your code would use a lot of memory, since you load the whole file into memory at once. It might be better to read and write the data in chunks:

from __future__ import with_statement
import urllib2
mp3file = urllib2.urlopen("http://www.example.com/songs/mp3.mp3")
with open('test.mp3','wb') as output:
    while True:
        buf = mp3file.read(65536)
        if not buf:
            break
        output.write(buf)
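
To get the wget-style progress the question asks about, the same chunked loop can report how many bytes it has copied so far. This is a minimal sketch rather than a definitive implementation: `copy_with_progress`, `on_progress` and `print_progress` are hypothetical names, and the total size is assumed to be supplied by the caller (for example from the response's Content-Length header, which a server may omit).

```python
import io


def copy_with_progress(src, dst, chunk_size=65536, total=None, on_progress=None):
    """Copy src to dst in chunks; return the number of bytes copied.

    src and dst are any file-like objects (e.g. the object returned by
    urlopen and a file opened with 'wb'). on_progress is a hypothetical
    callback invoked as on_progress(bytes_done, total) after each chunk.
    """
    done = 0
    while True:
        buf = src.read(chunk_size)
        if not buf:
            break
        dst.write(buf)
        done += len(buf)
        if on_progress is not None:
            on_progress(done, total)
    return done


def print_progress(done, total):
    # total is None when the size is unknown (no Content-Length header)
    if total:
        print("%d / %d bytes (%.1f%%)" % (done, total, 100.0 * done / total))
    else:
        print("%d bytes" % done)


if __name__ == "__main__":
    # Self-contained demo on in-memory streams instead of a real download
    data = b"x" * 200000
    out = io.BytesIO()
    copy_with_progress(io.BytesIO(data), out, total=len(data),
                       on_progress=print_progress)
```

For a real download, pass the object returned by `urlopen` as `src` and a file opened with `'wb'` as `dst`.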

Why not just call wget then?

import os
os.system("wget http://www.example.com/songs/mp3.mp3")
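
If you do shell out to wget, passing the arguments as a list avoids building a shell command string from the URL. This is a sketch, assuming wget is installed and on the PATH; `build_wget_command` and `download_with_wget` are hypothetical helper names. `-O` is wget's option for choosing the output file.

```python
import subprocess


def build_wget_command(url, output_path):
    # "-O <file>" tells wget where to write the download
    return ["wget", "-O", output_path, url]


def download_with_wget(url, output_path):
    # With a list argument no shell is involved, so characters such as
    # '&' or ';' in the URL are not interpreted as shell syntax.
    # Returns wget's exit status (0 on success).
    return subprocess.call(build_wget_command(url, output_path))
```

When run from a terminal, wget prints its own progress output, so no extra progress code is needed with this approach.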

Your current code will read the entire stream into memory before writing to disk. So in cases where the file is larger than your available memory, you will run into problems.

To resolve this, you can read one chunk at a time and write it to the file.


(copied from "Stream large binary files with urllib2 to file")

from __future__ import with_statement  # needed on Python 2.5
import urllib2

# url and file are placeholders for the source URL and the destination path
req = urllib2.urlopen(url)
CHUNK = 16 * 1024
with open(file, 'wb') as fp:
    while True:
        chunk = req.read(CHUNK)
        if not chunk:
            break
        fp.write(chunk)

"Experiment a bit with various CHUNK sizes to find the 'sweet spot' for your requirements."

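The read/write loop above is also what `shutil.copyfileobj` in the standard library does, with the chunk size passed as its third argument. This sketch uses in-memory streams so it runs without a network connection; `save_stream` is a hypothetical name, and in real use `src` would be the object returned by `urlopen`.

```python
import io
import shutil

CHUNK = 16 * 1024


def save_stream(src, dst, chunk_size=CHUNK):
    # copyfileobj reads at most chunk_size bytes at a time from src and
    # writes them to dst, so the whole payload is never held in memory
    shutil.copyfileobj(src, dst, chunk_size)


if __name__ == "__main__":
    payload = b"\x00" * (100 * 1024)
    out = io.BytesIO()
    save_stream(io.BytesIO(payload), out)
    print(len(out.getvalue()))
```

Changing `chunk_size` here is one way to try out different CHUNK values without rewriting the loop.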