
Azure blob container - Improve performance when fetching PDF files and file details from an Azure blob container

Below is my code to fetch the PDF files from an Azure blob container. Most of the time the container holds up to 4000 PDF files.

The issue I am facing: performance is very slow, and fetching takes much longer once there are more than 500 PDF files in the blob container.

I need input on how to correct my code to improve the performance. Is there any way to change the code below so that it fetches all the files and file details available in the blob container within 5 seconds?

SearchQuery.cs file:


    List<WorkflowDocumentListModel> documentListModel = new List<WorkflowDocumentListModel>();
    if (request.RequestModel.WorkflowStatusId == (byte)WorkflowStatus.RCV) {
        IList<PdfFileMetaDataResponse> RCVFileList = await _blobManager.GetAllFilesMetaDataAsync(UserContext.AzureBlobContainerWorkflowQueue, cancellationToken).ConfigureAwait(false);
        foreach (PdfFileMetaDataResponse item in RCVFileList) {
            documentListModel.Add(new WorkflowDocumentListModel {
                Id = 0,
                Name = item.FileName,
                AssignedTo = item.CustomUserName,
                AssignedToId = item.CustomUserId,
                AppDate = item.LastModified?.ClientTimeToUtc(UserContext.TimeZone),
                IsGrabbedByUser = userId == item.CustomUserId
            });
        }
        if (!string.IsNullOrEmpty(request.RequestModel.FullTextFilter?.Trim())) {
            documentListModel = documentListModel.Where(x => x.Name.Contains(request.RequestModel.FullTextFilter))
                .OrderBy(i => i.AppDate?.Date)
                .ThenBy(i => i.AppDate?.TimeOfDay).ToList();
        } else {
            documentListModel = documentListModel
                .OrderBy(i => i.AppDate?.Date)
                .ThenBy(i => i.AppDate?.TimeOfDay).ToList();
        }

        return new WorkflowDocumentSearchRepsonse { Data = documentListModel, RecordCount = documentListModel.Count };
    }

    
BlobManager.cs file method:
    
    
    public async Task<IList<PdfFileMetaDataResponse>> GetAllFilesMetaDataAsync(string containerName, CancellationToken cancellationToken) {
        BlobServiceClient _blobServiceClient = new BlobServiceClient(UserContext.AzureBlobStorageConnection);
        BlobContainerClient containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
        List<PdfFileMetaDataResponse> responseList = new List<PdfFileMetaDataResponse>();
        await foreach (BlobItem file in containerClient.GetBlobsAsync(cancellationToken: cancellationToken))
        {
            BlobClient blobClient = containerClient.GetBlobClient(file.Name);
            BlobProperties blobProperties = await blobClient.GetPropertiesAsync(cancellationToken: cancellationToken).ConfigureAwait(false);
            DateTime? customLastModifiedDate = ConvertWebApiFileLastModifiedMillisecondsToDateTime(
                blobProperties.Metadata?.Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataLastModifiedMilliseconds.ToUpperInvariant())
                    .Select(i => i.Value).FirstOrDefault());
            string customUserName = blobProperties.Metadata?.Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataUserName.ToUpperInvariant())
                .Select(i => i.Value).FirstOrDefault();
            bool hasCustomUserID = int.TryParse(blobProperties.Metadata?.Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataUserId.ToUpperInvariant())
                .Select(i => i.Value).FirstOrDefault(), out int customUserID);
            PdfFileMetaDataResponse response = new PdfFileMetaDataResponse() {
                FileName = file.Name,
                CustomUserName = customUserName,
                CustomUserId = hasCustomUserID ? customUserID : (int?)null,
                LastModified = customLastModifiedDate ?? file.Properties.LastModified?.UtcDateTime
            };
            responseList.Add(response);
        }
        return responseList;
    }
    

Thank you.

When you call BlobContainerClient.GetBlobsAsync, the properties of each blob are returned as part of the listing (BlobItem.Properties), so you do not need to call GetPropertiesAsync for each blob. That call costs one extra request per blob, so dropping it should improve the speed.

I also noticed that your code reads each blob's metadata (which is probably why you fetch the properties of each blob again). However, you can get the metadata of each blob as part of the blob listing operation as well. You just need to pass BlobTraits.Metadata when listing blobs.

So your blob listing code would be something like:

    await foreach (BlobItem file in containerClient.GetBlobsAsync(traits: BlobTraits.Metadata, cancellationToken: cancellationToken))
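To show how that listing call could replace the per-blob GetPropertiesAsync lookups, here is a minimal sketch of the whole method. It rests on some assumptions: PdfFileMetaDataResponse is reduced to a stand-in with the four properties used in the question, the metadata key names are passed in as the hypothetical parameters userNameKey and userIdKey (the values of the question's constants are not shown), GetMetadataValue is a hypothetical helper, and the custom last-modified conversion from the question is left out, so only the blob's own LastModified is used.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Stand-in for the question's response type (assumption: only these properties matter here).
public class PdfFileMetaDataResponse
{
    public string FileName { get; set; }
    public string CustomUserName { get; set; }
    public int? CustomUserId { get; set; }
    public DateTime? LastModified { get; set; }
}

public static class BlobListingSketch
{
    // Hypothetical helper: case-insensitive lookup of a single metadata value.
    // Returns null when the dictionary is null or the key is absent.
    public static string GetMetadataValue(IDictionary<string, string> metadata, string key)
    {
        if (metadata == null)
        {
            return null;
        }
        foreach (KeyValuePair<string, string> pair in metadata)
        {
            if (string.Equals(pair.Key, key, StringComparison.OrdinalIgnoreCase))
            {
                return pair.Value;
            }
        }
        return null;
    }

    // userNameKey and userIdKey are hypothetical parameters standing in for the
    // question's CustomBlobMetadataUserName / CustomBlobMetadataUserId constants.
    public static async Task<IList<PdfFileMetaDataResponse>> GetAllFilesMetaDataAsync(
        BlobContainerClient containerClient,
        string userNameKey,
        string userIdKey,
        CancellationToken cancellationToken)
    {
        List<PdfFileMetaDataResponse> responseList = new List<PdfFileMetaDataResponse>();

        // BlobTraits.Metadata asks the listing to include each blob's metadata,
        // so no per-blob GetPropertiesAsync call is made inside the loop.
        await foreach (BlobItem file in containerClient.GetBlobsAsync(
            traits: BlobTraits.Metadata, cancellationToken: cancellationToken))
        {
            bool hasUserId = int.TryParse(
                GetMetadataValue(file.Metadata, userIdKey), out int userId);

            responseList.Add(new PdfFileMetaDataResponse
            {
                FileName = file.Name,
                CustomUserName = GetMetadataValue(file.Metadata, userNameKey),
                CustomUserId = hasUserId ? userId : (int?)null,
                // The question's custom last-modified conversion would take precedence here.
                LastModified = file.Properties.LastModified?.UtcDateTime
            });
        }

        return responseList;
    }
}
```

With this shape, the number of requests depends on how many pages the listing returns rather than on the number of blobs, which is where most of the saving should come from.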
