
Python Azure blob storage: upload a file bigger than 64 MB

Using the sample code, I can upload files up to 64 MB without any problem:

   # Open in binary mode so the bytes are uploaded unchanged.
   with open(r'task1.txt', 'rb') as f:
       myblob = f.read()
   blob_service.put_blob('mycontainer', 'myblob', myblob, x_ms_blob_type='BlockBlob')

What if I want to upload a bigger file?

Thank you

I ran into the same problem a few days ago, and was lucky enough to find this. It breaks up the file into chunks and uploads it for you.

I hope this helps. Cheers!

I'm not a Python programmer, but I can offer a few extra tips (my stuff is all in C):

Use HTTP PUT operations with the comp=block option (Put Block) for as many blocks (up to 4 MB each) as your file requires, then issue a final Put Block List request (comp=blocklist option) that commits the blocks into a single blob. If a block upload fails or you need to abort, the cleanup for the partial set of blocks already uploaded is a DELETE request for the blob you were trying to create, but this appears to be supported by the 2013-08-15 version only (someone from Azure support should confirm this).
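To make the two request types concrete, here is a minimal Python sketch that only builds the URLs and the Put Block List body, without sending anything. The account, container and blob names are placeholders, the helper names are made up for illustration, and the 8-digit block id scheme is just one assumption; the service only needs the ids to be Base64-encoded and of equal length within a blob. Authentication headers are left out entirely.

```python
import base64
from urllib.parse import quote, urlencode

BASE = "https://{account}.blob.core.windows.net/{container}/{blob}"


def block_id(index):
    # Fixed-width decimal ids, Base64-encoded, so every id has the same length.
    raw = "{:08d}".format(index).encode("ascii")
    return base64.b64encode(raw).decode("ascii")


def blob_url(account, container, blob):
    return BASE.format(account=account, container=container, blob=quote(blob))


def put_block_url(account, container, blob, index):
    # PUT <blob url>?comp=block&blockid=<base64 id>, body = the raw block bytes
    query = urlencode({"comp": "block", "blockid": block_id(index)})
    return blob_url(account, container, blob) + "?" + query


def put_block_list_url(account, container, blob):
    # PUT <blob url>?comp=blocklist, body = the XML from block_list_body()
    return blob_url(account, container, blob) + "?comp=blocklist"


def block_list_body(block_count):
    # Lists the uploaded blocks in the order they should form the blob.
    entries = "".join(
        "<Latest>{}</Latest>".format(block_id(i)) for i in range(block_count)
    )
    return '<?xml version="1.0" encoding="utf-8"?><BlockList>{}</BlockList>'.format(entries)
```

You would issue one PUT per block against put_block_url(...), then a single PUT against put_block_list_url(...) with block_list_body(n) as the request body.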

If you need to add metadata, what I do when using the Block List method is issue an additional PUT request (with the comp=metadata option). There might be a more efficient way to set the metadata without an additional PUT, but I'm not aware of one.

This is a good question. Unfortunately, I don't see a real implementation for uploading arbitrarily large files. From what I can see, there is much more work to do on the Python SDK, unless I am missing something crucial.

The sample code provided in the documentation indeed uses just a single text file and uploads it in one request. From what I see in the SDK source code, nothing is implemented yet to support uploading larger files.

So, to work with blobs from Python, you need to understand how Azure Blob Storage works. Start here.

Then take a quick look at the REST API documentation for the Put Blob operation. The remarks mention:

The maximum upload size for a block blob is 64 MB. If your blob is larger than 64 MB, you must upload it as a set of blocks. For more information, see the Put Block (REST API) and Put Block List (REST API) operations. It's not necessary to call Put Blob if you upload the blob as a set of blocks.

The good news is that PutBlock and PutBlockList are implemented in the Python SDK, but no sample shows how to use them. You have to manually split your file into chunks (blocks) of up to 4 MB each, and then use the put_block(self, container_name, blob_name, block, blockid, content_md5=None, x_ms_lease_id=None): function from the Python SDK to upload the blocks. Ideally you would upload the blocks in parallel. Do not forget, however, that you also have to call put_block_list(self, container_name, blob_name, block_list, content_md5=None, x_ms_blob_cache_control=None... at the end to commit all the uploaded blocks.
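A rough sketch of that split-and-commit loop might look like the following. It assumes blob_service is the same object used in the question and that put_block and put_block_list take the positional arguments shown in the signatures above. The upload_in_blocks helper and the 8-digit block ids are my own illustration, and whether the SDK Base64-encodes the ids for you depends on the version, so check that before relying on it. Blocks are uploaded one after another here to keep the sketch short.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB per block, the limit mentioned above


def upload_in_blocks(blob_service, container_name, blob_name, file_path,
                     block_size=BLOCK_SIZE):
    """Upload file_path as a block blob, one block at a time.

    Hypothetical helper: only put_block and put_block_list come from the SDK.
    """
    block_ids = []
    with open(file_path, "rb") as source:
        index = 0
        while True:
            chunk = source.read(block_size)
            if not chunk:
                break
            # Ids must all have the same length; zero-padding guarantees that.
            block_id = "{:08d}".format(index)
            blob_service.put_block(container_name, blob_name, chunk, block_id)
            block_ids.append(block_id)
            index += 1
    # Nothing is visible as a blob until the block list is committed.
    blob_service.put_block_list(container_name, blob_name, block_ids)
    return block_ids
```

With the names from the question, the call would be upload_in_blocks(blob_service, 'mycontainer', 'myblob', r'task1.txt'). Only one block is held in memory at a time, so the file size is not limited by available RAM.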

Unfortunately, I'm not enough of a Python expert to help you further, but at least this gives you a good picture of the situation.
