How do I concatenate multiple CloudBlockBlob objects into one CloudBlockBlob object in Azure?

I am trying to concatenate multiple CloudBlockBlob objects into a single CloudBlockBlob object inside an Azure Function. I have tried downloading the source blobs into a MemoryStream and then uploading that stream to a new blob, but the function times out before the operation completes. I also tried writing to the new blob after reading each input blob, but each write to a CloudBlockBlob overwrites the previous output. I am aware of CloudAppendBlob, but I want the output to be a CloudBlockBlob.

Is there a better way to do this?

Here is my code, which reads multiple CloudBlockBlobs into a MemoryStream and then writes that stream to a new CloudBlockBlob.

    public async Task CatBlob(string[] srcBlobs, string destinationBlob)
    {
        var connectionString = Config.AzConnStr;
        var container = Config.AzContainer;

        CloudStorageAccount storageAccount = null;
        CloudBlobContainer cloudBlobContainer = null;

        if (CloudStorageAccount.TryParse(connectionString, out storageAccount))
        {
            try
            {
                CloudBlobClient cloudBlobClient = storageAccount.CreateCloudBlobClient();
                cloudBlobContainer = cloudBlobClient.GetContainerReference(container);

                // Use the destinationBlob parameter rather than a hard-coded name.
                CloudBlockBlob blockBlobDest = cloudBlobContainer.GetBlockBlobReference(destinationBlob);
                using (MemoryStream memStream = new MemoryStream())
                {
                    for (int i = 0; i < srcBlobs.Length; i++)
                    {
                        CloudBlockBlob cloudBlockBlobSrc = cloudBlobContainer.GetBlockBlobReference(srcBlobs[i]);
                        Console.WriteLine("loop {0}", i);
                        await cloudBlockBlobSrc.DownloadToStreamAsync(memStream);
                    }
                    memStream.Seek(0, SeekOrigin.Begin);
                    await blockBlobDest.UploadFromStreamAsync(memStream);
                }
            }
            catch (Exception ex)
            {
                Logger.Error("Exception while concatenating files: " + ex.Message, Context);
                throw;
            }
        }
        else
        {
            Logger.Error("Exception connecting to cloud storage while concatenating files", Context);
            throw new Exception("Could not connect to the Azure using Connection String.");
        }
    }

Please see the sample code below. It uses the Azure.Storage.Blobs SDK (version 12.9.1). I have not run this code, so it may contain errors.

The idea is to download each blob separately and immediately stage its contents as a block in the destination blob, instead of accumulating everything in one very large memory stream on the client as you do now. Once all blocks are staged, you commit the block list to create the destination blob.

using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

namespace SO68566758
{
    class Program
    {
        private const string connectionString = "your-connection-string";
        private const string container = "your-container-name";
        
        static async Task Main(string[] args)
        {
            string[] srcBlobs = new[] { "blob1.txt", "blob2.txt" }; // Specify source blob names.
            string destinationBlob = "subfolder/test.zip"; // Specify destination blob name.
            await CatBlob(srcBlobs, destinationBlob);
        }

        /// <summary>
        /// This method downloads the blobs specified in source blobs list (one at a time)
        /// and uploads the contents of that blob as a block in the destination blob. Once
        /// all blocks are uploaded, block list is committed to create the destination blob.
        /// </summary>
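        /// <param name="srcBlobs">Names of the source blobs, in the order they should be concatenated.</param>
        /// <param name="destinationBlob">Name of the destination block blob to create.</param>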
        public static async Task CatBlob(string[] srcBlobs, string destinationBlob)
        {
            BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
            BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(container);
            BlockBlobClient destinationBlobClient = containerClient.GetBlockBlobClient(destinationBlob);
            List<string> blockIds = new List<string>();
            for (var i = 0; i < srcBlobs.Length; i++)
            {
                BlockBlobClient sourceBlobClient = containerClient.GetBlockBlobClient(srcBlobs[i]);
                //Download source blob and read its contents as stream.
                BlobDownloadResult downloadResult = await sourceBlobClient.DownloadContentAsync();
                using (Stream stream = downloadResult.Content.ToStream())
                {
                    string blockId = Convert.ToBase64String(Encoding.UTF8.GetBytes(i.ToString("d6")));
                    stream.Position = 0;
                    //Upload that as a block in the destination blob.
                    await destinationBlobClient.StageBlockAsync(blockId, stream);
                    blockIds.Add(blockId);
                }
            }
            // All blocks have been staged. Now it's time to commit the block list to create the destination blob.
            await destinationBlobClient.CommitBlockListAsync(blockIds);
        }
    }
}

See the Put Block and Put Block List REST API operations to understand how a block blob can be uploaded in chunks.
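One detail of the code above that is easy to get wrong is the block ID. Put Block expects each ID to be a Base64 string, and all IDs within one blob must have the same length, which is why the sample pads the loop index with `ToString("d6")` before encoding. The following is a minimal sketch of that step pulled out into a helper; the names `BlockIdHelper`, `FromIndex`, and `ForCount` are hypothetical and not part of the Azure SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not part of Azure.Storage.Blobs): builds block IDs
// the same way the sample above does inside its loop.
public static class BlockIdHelper
{
    // Encodes a zero-based block index as a Base64 block ID.
    // The index is zero-padded to six digits so every ID has the same length.
    public static string FromIndex(int index)
    {
        if (index < 0 || index > 999999)
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(index.ToString("d6")));
    }

    // Returns the IDs for blocks 0..count-1, in the order they should be committed.
    public static List<string> ForCount(int count)
    {
        var ids = new List<string>(count);
        for (int i = 0; i < count; i++)
        {
            ids.Add(FromIndex(i));
        }
        return ids;
    }
}
```

With a helper like this, the loop would call `BlockIdHelper.FromIndex(i)` when staging each block and pass the same IDs, in order, to `CommitBlockListAsync`. Six digits is more than enough here, since a block blob can hold at most 50,000 committed blocks.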
