简体   繁体   中英

Upload blocks in parallel in blob storage

I am trying to convert this to parallel to improve the upload times of a file but with what I have tried it has not had great changes in time. I want to upload the blocks side-by-side and then confirm them. How could I manage to do it in parallel?

public static async Task UploadInBlocks
    (BlobContainerClient blobContainerClient, string localFilePath, int blockSize)
{
    string fileName = Path.GetFileName(localFilePath);
    BlockBlobClient blobClient = blobContainerClient.GetBlockBlobClient(fileName);

    FileStream fileStream = File.OpenRead(localFilePath);

    ArrayList blockIDArrayList = new ArrayList();

    byte[] buffer;

    var bytesLeft = (fileStream.Length - fileStream.Position);

    while (bytesLeft > 0)
    {
        if (bytesLeft >= blockSize)
        {
            buffer = new byte[blockSize];
            await fileStream.ReadAsync(buffer, 0, blockSize);
        }
        else
        {
            buffer = new byte[bytesLeft];
            await fileStream.ReadAsync(buffer, 0, Convert.ToInt32(bytesLeft));
            bytesLeft = (fileStream.Length - fileStream.Position);
        }

        using (var stream = new MemoryStream(buffer))
        {
            string blockID = Convert.ToBase64String
                (Encoding.UTF8.GetBytes(Guid.NewGuid().ToString()));
            
            blockIDArrayList.Add(blockID);


            await blobClient.StageBlockAsync(blockID, stream);
        }

        bytesLeft = (fileStream.Length - fileStream.Position);

    }

    string[] blockIDArray = (string[])blockIDArrayList.ToArray(typeof(string));

    await blobClient.CommitBlockListAsync(blockIDArray);
}

Of course. You shouldn't expect any improvements - quite the opposite. Blob storage doesn't have any simplistic throughput throttling that would benefit from uploading in multiple streams, and you're already doing extremely light-weight I/O which is going to be entirely I/O bound.

Good I/O code has absolutely no benefits from parallelization. No matter how many workers you put on the job, the pipe is only this thick and will not allow you to pass more data through.

All your code just reimplements the already very efficient mechanisms that the blob storage library has... and you do it considerably worse, with pointless allocation, wrong arguments and new opportunities for bugs. Don't do that. The library can deal with streams just fine.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM