How to download a large file with httplib2

Is it possible to download a large file in chunks using httplib2? I am downloading files from a Google API, and in order to use the credentials from the Google OAuth2WebServerFlow, I am bound to use httplib2.

At the moment I am doing:

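import httplib2
from oauth2client.client import OAuth2WebServerFlow

flow = OAuth2WebServerFlow(
    client_id=XXXX,
    client_secret=XXXX,
    scope=XYZ,
    redirect_uri=XYZ
)

credentials = flow.step2_exchange(oauth_code)

# Wrap the Http object so every request carries the OAuth2 credentials.
http = httplib2.Http()
http = credentials.authorize(http)

# request() returns the whole response body in memory as `content`.
resp, content = http.request(url, "GET")
with open(file_name, 'wb') as fw:
    fw.write(content)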

But the content variable can grow to more than 500 MB.

Is there any way of reading the response in chunks?

You could consider streaming_httplib2, a fork of httplib2 with exactly that change in behaviour.

"In order to use the credentials from the Google OAuth2WebServerFlow, I am bound to use httplib2."

If you need features that aren't available in httplib2, it's worth looking at how much work it would take to get your credential handling working with another HTTP library. It may be a good longer-term investment. (See, e.g., "How to download large file in python with requests.py?")

About reading the response in chunks (this works with httplib and should work with httplib2 as well):

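import httplib  # Python 2 module

conn = httplib.HTTPConnection("google.com")
conn.request("GET", "/")
r1 = conn.getresponse()

# r1.fp is the underlying file object; each next() call returns one line.
try:
    print r1.fp.next()
    print r1.fp.next()
except StopIteration:
    print "Exception handled!"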

Note: next() may raise a StopIteration exception once the response is exhausted, so you need to handle it.

You can avoid calling next() explicitly by iterating over the file object:

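# Open in binary mode so the bytes are written unchanged;
# the with block closes (and flushes) the file when done.
with open("file.html", "wb") as f:
    for line in r1.fp:
        f.write(line)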
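Iterating over the file object splits the body on newlines, which can produce very uneven pieces for binary data. As a minimal sketch of the same idea with fixed-size blocks, the following assumes Python 3, where httplib is named http.client. The names iter_chunks and download, and the 64 KiB default, are hypothetical and chosen for illustration only.

```python
import http.client  # Python 3 name for the httplib module used above


def iter_chunks(response, chunk_size=64 * 1024):
    """Yield successive blocks of at most chunk_size bytes from a file-like response."""
    while True:
        chunk = response.read(chunk_size)
        if not chunk:  # an empty bytes object means the body is exhausted
            break
        yield chunk


def download(host, path, file_name, chunk_size=64 * 1024):
    """Stream http://host/path to file_name without holding the whole body in memory."""
    conn = http.client.HTTPConnection(host)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        with open(file_name, "wb") as out:
            for chunk in iter_chunks(response, chunk_size):
                out.write(chunk)
    finally:
        conn.close()
```

With this sketch, only one block of chunk_size bytes is held in memory at a time, regardless of the size of the file.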

You can apply oauth2client.client.Credentials to a urllib2 request.

First, obtain the credentials object. In your case, you're using:

credentials = flow.step2_exchange(oauth_code)

Now, use that object to get the auth headers and add them to the urllib2 request:

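import urllib2  # Python 2 module

req = urllib2.Request(url)
# apply() fills the dict with the authorization headers for these credentials.
auth_headers = {}
credentials.apply(auth_headers)
for k, v in auth_headers.iteritems():
    req.add_header(k, v)
resp = urllib2.urlopen(req)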

Now resp is a file-like object that you can use to read the contents of the URL, for example in chunks with resp.read(size).
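To make the last step concrete, here is a rough sketch of copying such a file-like response to disk in blocks. It assumes Python 3, where urllib2 has become urllib.request. The helper names build_request and save_stream and the 1 MiB default are hypothetical; credentials, url and file_name are the variables from the question.

```python
import shutil
import urllib.request  # Python 3 counterpart of urllib2


def build_request(url, auth_headers):
    """Return a Request for url carrying the given authorization headers."""
    req = urllib.request.Request(url)
    for name, value in auth_headers.items():
        req.add_header(name, value)
    return req


def save_stream(resp, file_name, chunk_size=1024 * 1024):
    """Copy a file-like response to disk, chunk_size bytes at a time."""
    with open(file_name, "wb") as out:
        shutil.copyfileobj(resp, out, chunk_size)


# Usage sketch (not executed here), with credentials and url from the question:
#   auth_headers = {}
#   credentials.apply(auth_headers)
#   with urllib.request.urlopen(build_request(url, auth_headers)) as resp:
#       save_stream(resp, file_name)
```

shutil.copyfileobj reads and writes one block at a time, so memory use stays around chunk_size rather than the size of the download.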
