
Download a .gz file from a subdirectory in an S3 bucket using boto

I have a file named combine.gz which I need to download from a subfolder in an S3 bucket. I am able to get to the combine.gz files (specifically one per directory), but I cannot find a method in boto for downloading the .gz files to my local machine.

All I can find are the boto.utils.fetch_file, key.get_contents_to_filename, and key.get_contents_to_file methods, all of which, as I understand it, stream the contents of the file directly.

Is there a way for me to first download the compressed file in .gz format from S3 to my local machine using boto and then uncompress it?
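
In case it helps clarify, the two-step flow I have in mind would look roughly like this (just an untested sketch; here key stands for the boto Key object for combine.gz, and the /tmp paths are placeholders):

import gzip
import shutil

key.get_contents_to_filename('/tmp/combine.gz')        # download the object to disk first
with gzip.open('/tmp/combine.gz', 'rb') as gz, open('/tmp/combine', 'wb') as out:
    shutil.copyfileobj(gz, out)                         # then decompress it locally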

Any help would be much appreciated.

You can read the full contents as a string and then manage it as a string object. This is risky and could lead to memory or buffer issues, so be careful.

Look into using cStringIO.StringIO, gzip.GzipFile, and boto:

import cStringIO
import gzip

datastring = key.get_contents_as_string()       # pull the whole S3 object into memory
data = cStringIO.StringIO(datastring)           # wrap the bytes in a file-like object
rawdata = gzip.GzipFile(fileobj=data).read()    # decompress entirely in memory

Again, be careful: this can use a lot of memory and has potential security issues if the gzip file is malformed. You'll want to wrap it in try/except and code defensively if you don't control both sides.
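
For example, a minimal defensive version of the snippet above might look like this (a sketch only; the exception types are what Python 2's gzip/zlib raise for bad input, and key is the same boto Key object as before):

import cStringIO
import gzip
import zlib

try:
    datastring = key.get_contents_as_string()
    rawdata = gzip.GzipFile(fileobj=cStringIO.StringIO(datastring)).read()
except (IOError, zlib.error) as e:
    # IOError covers a non-gzip header ("Not a gzipped file"), zlib.error a corrupt stream
    rawdata = None
    print 'failed to decompress %s: %s' % (key.name, e)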
