
How to download a large file with httplib2

Is it possible to download a large file in chunks using httplib2? I am downloading files from a Google API, and in order to use the credentials from the Google OAuth2WebServerFlow, I am bound to use httplib2.

At the moment I am doing:

flow = OAuth2WebServerFlow(
    client_id=XXXX,
    client_secret=XXXX,
    scope=XYZ,
    redirect_uri=XYZ
)

credentials = flow.step2_exchange(oauth_code)

http = httplib2.Http()
http = credentials.authorize(http)

# request() returns the whole response body as a single string
resp, content = http.request(url, "GET")
with open(file_name, 'wb') as fw:
    fw.write(content)

But the content variable can grow beyond 500 MB, and all of it is held in memory before it is written out.

Is there any way to read the response in chunks?

You could consider streaming_httplib2, a fork of httplib2 with exactly that change in behaviour.

"in order to use the credentials from the google OAuth2WebServerFlow, I am bound to use httplib2."

If you need features that aren't available in httplib2, it's worth looking at how much work it would take to get your credential handling working with another HTTP library. It may be a good longer-term investment. (See, for example, "How to download large file in python with requests.py?")

Reading the response in chunks (this works with httplib; httplib2's request() returns the whole body in one piece, so it may not carry over unchanged):

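import httplib

conn = httplib.HTTPConnection("google.com")
conn.request("GET", "/")
r1 = conn.getresponse()

try:
    # r1.fp is the underlying file object; next() returns one line at a time
    print r1.fp.next()
    print r1.fp.next()
except StopIteration:
    # raised once the response has no more lines
    print "Exception handled!"

Note: next() raises StopIteration when there is nothing left to read, so you need to handle it.

You can avoid calling next() by iterating over the file object directly: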

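# "wb" so that binary content is written unchanged;
# the with block closes the file when the loop ends
with open("file.html", "wb") as f:
    for line in r1.fp:
        f.write(line)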
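
Iterating over the file object yields one line at a time, and for binary data a "line" can be of any size. A minimal sketch of reading fixed-size blocks instead, assuming Python 3 (where httplib is named http.client); copy_in_chunks, download and the 1 MB default are hypothetical names and values chosen for illustration:

```python
import http.client
import io


def copy_in_chunks(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in blocks of at most chunk_size bytes.

    src only needs a read(n) method and dst a write(data) method.
    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        block = src.read(chunk_size)
        if not block:  # empty bytes object: end of the body
            break
        dst.write(block)
        total += len(block)
    return total


def download(host, path, file_name):
    """Hypothetical helper: fetch https://host/path and stream it to file_name."""
    conn = http.client.HTTPSConnection(host)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()  # HTTPResponse supports read(amt)
        with open(file_name, "wb") as fw:
            return copy_in_chunks(resp, fw)
    finally:
        conn.close()
```

Only one block is held in memory at a time, so peak memory use is bounded by chunk_size rather than by the size of the file.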

You can apply oauth2client.client.Credentials to a urllib2 request.

First, obtain the credentials object. In your case, you're using:

credentials = flow.step2_exchange(oauth_code)

Now, use that object to get the auth headers and add them to the urllib2 request:

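import urllib2

req = urllib2.Request(url)
auth_headers = {}
# apply() fills the dict with the authorization header for these credentials
credentials.apply(auth_headers)
for k, v in auth_headers.iteritems():
    req.add_header(k, v)
resp = urllib2.urlopen(req)

Now resp is a file-like object, so you can read the contents of the URL piece by piece with resp.read(size) instead of loading everything at once.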
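
Putting the two steps together, here is a rough sketch of the same idea, assuming Python 3 (where urllib2 became urllib.request). The function names authorized_request, save_response and download are hypothetical; the only thing assumed about credentials is the apply(headers) method used above:

```python
import shutil
import urllib.request


def authorized_request(url, credentials):
    """Build a Request carrying the auth headers supplied by credentials."""
    headers = {}
    credentials.apply(headers)  # same call as in the snippet above
    req = urllib.request.Request(url)
    for key, value in headers.items():
        req.add_header(key, value)
    return req


def save_response(resp, file_name, chunk_size=1024 * 1024):
    """Stream a file-like response to disk, chunk_size bytes at a time."""
    with open(file_name, "wb") as fw:
        shutil.copyfileobj(resp, fw, chunk_size)


def download(url, credentials, file_name):
    """Hypothetical helper: authorized GET of url, streamed to file_name."""
    with urllib.request.urlopen(authorized_request(url, credentials)) as resp:
        save_response(resp, file_name)
```

shutil.copyfileobj reads and writes in a loop, so the 500 MB body never has to fit in memory in one piece.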
