
Azure blob container - Improve performance in fetching pdf files and file details from Azure blob container

Below is my code to fetch the pdf files from an Azure blob container. Most of the time the blob container contains up to 4000 pdf files.

The issue I am facing: performance is very slow, and fetching takes much longer once there are more than 500 pdf files in the blob container.

I need input and help correcting my code to improve performance. Is there any way to improve the code below so that it fetches all the files and file details available in the blob container within 5 seconds?

SearchQuery.cs file:


    List<WorkflowDocumentListModel> documentListModel = new List<WorkflowDocumentListModel>();
    if (request.RequestModel.WorkflowStatusId == (byte)WorkflowStatus.RCV) {
        IList<PdfFileMetaDataResponse> RCVFileList = await _blobManager
            .GetAllFilesMetaDataAsync(UserContext.AzureBlobContainerWorkflowQueue, cancellationToken)
            .ConfigureAwait(false);

        foreach (PdfFileMetaDataResponse item in RCVFileList) {
            documentListModel.Add(new WorkflowDocumentListModel {
                Id = 0,
                Name = item.FileName,
                AssignedTo = item.CustomUserName,
                AssignedToId = item.CustomUserId,
                AppDate = item.LastModified?.ClientTimeToUtc(UserContext.TimeZone),
                IsGrabbedByUser = userId == item.CustomUserId
            });
        }

        if (!string.IsNullOrEmpty(request.RequestModel.FullTextFilter?.Trim())) {
            documentListModel = documentListModel
                .Where(x => x.Name.Contains(request.RequestModel.FullTextFilter))
                .OrderBy(i => i.AppDate?.Date)
                .ThenBy(i => i.AppDate?.TimeOfDay)
                .ToList();
        } else {
            documentListModel = documentListModel
                .OrderBy(i => i.AppDate?.Date)
                .ThenBy(i => i.AppDate?.TimeOfDay)
                .ToList();
        }

        return new WorkflowDocumentSearchRepsonse { Data = documentListModel, RecordCount = documentListModel.Count };
    }

    
BlobManager.cs file method:
    
    
    public async Task<IList<PdfFileMetaDataResponse>> GetAllFilesMetaDataAsync(string containerName, CancellationToken cancellationToken) {
        BlobServiceClient _blobServiceClient = new BlobServiceClient(UserContext.AzureBlobStorageConnection);
        BlobContainerClient containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
        List<PdfFileMetaDataResponse> responseList = new List<PdfFileMetaDataResponse>();

        await foreach (BlobItem file in containerClient.GetBlobsAsync(cancellationToken: cancellationToken)) {
            BlobClient blobClient = containerClient.GetBlobClient(file.Name);
            BlobProperties blobProperties = await blobClient.GetPropertiesAsync(cancellationToken: cancellationToken).ConfigureAwait(false);

            DateTime? customLastModifiedDate = ConvertWebApiFileLastModifiedMillisecondsToDateTime(
                blobProperties.Metadata?
                    .Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataLastModifiedMilliseconds.ToUpperInvariant())
                    .Select(i => i.Value)
                    .FirstOrDefault());

            string customUserName = blobProperties.Metadata?
                .Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataUserName.ToUpperInvariant())
                .Select(i => i.Value)
                .FirstOrDefault();

            bool hasCustomUserID = int.TryParse(
                blobProperties.Metadata?
                    .Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataUserId.ToUpperInvariant())
                    .Select(i => i.Value)
                    .FirstOrDefault(),
                out int customUserID);

            PdfFileMetaDataResponse response = new PdfFileMetaDataResponse() {
                FileName = file.Name,
                CustomUserName = customUserName,
                CustomUserId = hasCustomUserID ? customUserID : (int?)null,
                LastModified = customLastModifiedDate ?? file.Properties.LastModified?.UtcDateTime
            };
            responseList.Add(response);
        }

        return responseList;
    }
    

Thank you.

When you call BlobContainerClient.GetBlobsAsync, the blob properties are fetched automatically as part of the listing. So you need not call GetPropertiesAsync for each blob; dropping that call removes one extra request per blob, which should improve the speed.

I also noticed that your code checks the blob's metadata (and that is probably why you are fetching the properties of each blob again). However, you can get the metadata of each blob as part of the blob listing operation. You just need to pass BlobTraits.Metadata when listing blobs.

So your list blobs code would be something like:

    await foreach (BlobItem file in containerClient.GetBlobsAsync(traits: BlobTraits.Metadata, cancellationToken: cancellationToken))
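To make that concrete, here is a minimal sketch of how the whole method from the question might look once the per-blob GetPropertiesAsync call is dropped and the metadata is read from the listing instead. Several pieces are assumptions rather than the asker's real code: the metadata key values are placeholders, PdfFileMetaDataResponse is a stand-in with only the fields used in the question, the container client is passed in rather than built from UserContext, and the asker's ConvertWebApiFileLastModifiedMillisecondsToDateTime helper (not shown in the question) is supplied as a delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Stand-in for the question's response type (assumption: only these fields matter here).
public class PdfFileMetaDataResponse
{
    public string FileName { get; set; }
    public string CustomUserName { get; set; }
    public int? CustomUserId { get; set; }
    public DateTime? LastModified { get; set; }
}

public static class BlobListing
{
    // Hypothetical metadata key names; substitute the real CustomBlobMetadata* constants.
    private const string LastModifiedKey = "lastmodifiedmilliseconds";
    private const string UserNameKey = "username";
    private const string UserIdKey = "userid";

    // Case-insensitive metadata lookup; returns null when the key (or the dictionary) is missing.
    public static string GetMetadataValue(IDictionary<string, string> metadata, string key)
    {
        if (metadata == null) return null;
        foreach (KeyValuePair<string, string> pair in metadata)
        {
            if (string.Equals(pair.Key, key, StringComparison.OrdinalIgnoreCase))
                return pair.Value;
        }
        return null;
    }

    public static async Task<IList<PdfFileMetaDataResponse>> GetAllFilesMetaDataAsync(
        BlobContainerClient containerClient,
        Func<string, DateTime?> parseLastModified, // the asker's own conversion helper
        CancellationToken cancellationToken)
    {
        var responseList = new List<PdfFileMetaDataResponse>();

        // BlobTraits.Metadata asks the listing to return each blob's metadata,
        // so no per-blob GetPropertiesAsync request is needed inside the loop.
        await foreach (BlobItem file in containerClient.GetBlobsAsync(
            traits: BlobTraits.Metadata, cancellationToken: cancellationToken))
        {
            bool hasUserId = int.TryParse(
                GetMetadataValue(file.Metadata, UserIdKey), out int userId);

            responseList.Add(new PdfFileMetaDataResponse
            {
                FileName = file.Name,
                CustomUserName = GetMetadataValue(file.Metadata, UserNameKey),
                CustomUserId = hasUserId ? userId : (int?)null,
                // Prefer the custom timestamp; fall back to the blob's own LastModified.
                LastModified = parseLastModified(GetMetadataValue(file.Metadata, LastModifiedKey))
                    ?? file.Properties.LastModified?.UtcDateTime
            });
        }

        return responseList;
    }
}
```

The original loop issued one listing call plus one GetPropertiesAsync call per blob, so roughly 4000 extra round trips for a full container. In this version the only network traffic is the paged listing itself, and since the service returns up to 5000 blobs per page by default, a container of that size would normally be covered by a single request. Whether that lands under the 5 second target depends on network latency and would need to be measured.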
