Azure blob storage - uploading very slow

I have a trial account for Azure blob storage. I am trying to upload 100000 generated files from my local machine. The operation has already been running for over 17 hours and has uploaded only ~77000 files. All the files were created by a simple bash script:

for i in {1..100000}
do
    echo $i
    echo $i > $1\\$i.txt
done

The upload code:

using(var stream = File.OpenWrite(textBoxManyUploadFileName.Text))
using(var writer = new StreamWriter(stream)) {
    foreach(var file in Directory.GetFiles(textBoxManyUploadFrom.Text)) {
        Guid id = Guid.NewGuid();
        storage.StoreFile(file, id, ((FileType)comboBoxManyUploadTypes.SelectedItem).Number);
        writer.WriteLine("{0}={1}", id, file);
    }
}

public void StoreFile(Stream stream, Guid id, string container) {
    try {
        var blob = GetBlob(id, container);
        blob.UploadFromStream(stream);
    } catch(StorageException exception) {
        throw TranslateException(exception, id, container);
    }
}

public void StoreFile(string filename, Guid id, int type = 0) {
    using(var stream = File.OpenRead(filename)) {
        StoreFile(stream, id, type);
    }
}

CloudBlob GetBlob(Guid id, string containerName) {
    var container = azureBlobClient.GetContainerReference(containerName);
    if(container.CreateIfNotExist()) {
        container.SetPermissions(new BlobContainerPermissions {
            PublicAccess = BlobContainerPublicAccessType.Container
        });
    }
    return container.GetBlobReference(id.ToString());
}

The first 10000 files were uploaded in 20-30 minutes, then the speed decreased. I think it may be because the file names are GUIDs and Azure tries to build a clustered index. How can I speed this up? What is the problem?

To upload many small files, you should use multiple threads. You can use BeginUploadFromStream or Parallel.ForEach, for instance.
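Here is a minimal sketch of the Parallel.ForEach variant. It does not reference the storage SDK: the `uploadOne` delegate is a hypothetical stand-in for the question's `storage.StoreFile(file, id, type)` call, and `ParallelUploadSketch`, `UploadAll` and the default of 16 parallel uploads are names and values chosen only for illustration. The id-to-file mapping is collected in a thread-safe dictionary, because the single StreamWriter in the original loop should not be written to from several threads at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

static class ParallelUploadSketch
{
    // uploadOne is an assumed placeholder for storage.StoreFile(file, id, type).
    public static ConcurrentDictionary<Guid, string> UploadAll(
        string directory, Action<string, Guid> uploadOne, int maxParallel = 16)
    {
        // Thread-safe replacement for writing "id=file" lines inside the loop.
        var uploaded = new ConcurrentDictionary<Guid, string>();

        // Cap the number of simultaneous uploads (16 is an arbitrary starting point).
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(Directory.GetFiles(directory), options, file =>
        {
            Guid id = Guid.NewGuid();
            uploadOne(file, id);   // one blocking upload per file, many running at once
            uploaded[id] = file;   // recorded only after the upload returned
        });

        return uploaded;
    }
}
```

After `UploadAll` returns, the mapping file can be written sequentially from the returned dictionary. On the .NET Framework the default limit of two concurrent HTTP connections per host (`ServicePointManager.DefaultConnectionLimit`) may also need to be raised before the extra threads make a difference.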

One more thing I noticed in your code is that StoreFile() calls GetBlob(), which in turn calls CreateIfNotExist() on your blob container for every file. Please note that this function also results in a call to the storage service, which adds delay to your upload process (not to mention that you are charged for a storage transaction each time you call it).

I would recommend that you call this function just once, before starting your blob upload.
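One way to get "just once" without restructuring the caller is to cache the container per name. The sketch below is SDK-independent: `ContainerCache`, `Get` and the `createAndConfigure` delegate are hypothetical names, and `TContainer` stands in for whatever container type `GetContainerReference` returns in the question's code. It is a sketch under those assumptions, not the only way to do it.

```csharp
using System;
using System.Collections.Concurrent;

// Runs the expensive "create container if needed" step once per container name.
sealed class ContainerCache<TContainer>
{
    private readonly ConcurrentDictionary<string, Lazy<TContainer>> cache =
        new ConcurrentDictionary<string, Lazy<TContainer>>();

    // Assumed to wrap GetContainerReference + CreateIfNotExist + SetPermissions.
    private readonly Func<string, TContainer> createAndConfigure;

    public ContainerCache(Func<string, TContainer> createAndConfigure)
    {
        this.createAndConfigure = createAndConfigure;
    }

    public TContainer Get(string name)
    {
        // Lazy<T> guarantees the factory runs at most once for the stored entry,
        // even when several upload threads ask for the same name concurrently.
        return cache.GetOrAdd(
            name,
            n => new Lazy<TContainer>(() => createAndConfigure(n))).Value;
    }
}
```

With something like this in place, GetBlob() would only need `cache.Get(containerName).GetBlobReference(id.ToString())`, so the round trip to the storage service happens once per container instead of once per file.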
