Azure blob container - Improve performance in fetching PDF files and file details from Azure blob container
Below is my code to fetch the PDF files from an Azure blob container. Most of the time the blob container holds up to 4000 PDF files.
The issue I am facing: performance is very slow, and fetching takes much longer, when there are more than 500 PDF files in the blob container.
I need input and help correcting my code to improve performance. Is there any way to improve the code below so that it fetches all the files and file details available in the blob container within 5 seconds?
SearchQuery.cs file:
List<WorkflowDocumentListModel> documentListModel = new List<WorkflowDocumentListModel>();
if (request.RequestModel.WorkflowStatusId == (byte)WorkflowStatus.RCV) {
IList<PdfFileMetaDataResponse> RCVFileList = await _blobManager.GetAllFilesMetaDataAsync(UserContext.AzureBlobContainerWorkflowQueue, cancellationToken).ConfigureAwait(false);
foreach (PdfFileMetaDataResponse item in RCVFileList ) {
documentListModel.Add(new WorkflowDocumentListModel {
Id = 0,
Name = item.FileName,
AssignedTo = item.CustomUserName,
AssignedToId = item.CustomUserId,
AppDate = item.LastModified?.ClientTimeToUtc(UserContext.TimeZone),
IsGrabbedByUser = userId == item.CustomUserId
});
}
if (!string.IsNullOrEmpty(request.RequestModel.FullTextFilter?.Trim())) {
documentListModel = documentListModel.Where(x => x.Name.Contains(request.RequestModel.FullTextFilter))
.OrderBy(i => i.AppDate?.Date)
.ThenBy(i => i.AppDate?.TimeOfDay).ToList();
} else {
documentListModel = documentListModel
.OrderBy(i => i.AppDate?.Date)
.ThenBy(i => i.AppDate?.TimeOfDay).ToList();
}
return new WorkflowDocumentSearchRepsonse { Data = documentListModel, RecordCount = documentListModel.Count };
}
BlobManager.cs file method:
public async Task<IList<PdfFileMetaDataResponse>> GetAllFilesMetaDataAsync(string containerName, CancellationToken cancellationToken) {
BlobServiceClient _blobServiceClient = new BlobServiceClient(UserContext.AzureBlobStorageConnection);
BlobContainerClient containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
List<PdfFileMetaDataResponse> responseList = new List<PdfFileMetaDataResponse>();
await foreach (BlobItem file in containerClient.GetBlobsAsync(cancellationToken: cancellationToken))
{
BlobClient blobClient = containerClient.GetBlobClient(file.Name);
BlobProperties blobProperties = await blobClient.GetPropertiesAsync(cancellationToken: cancellationToken).ConfigureAwait(false);
DateTime? customLastModifiedDate = ConvertWebApiFileLastModifiedMillisecondsToDateTime(
blobProperties.Metadata?.Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataLastModifiedMilliseconds.ToUpperInvariant()) .Select(i => i.Value).FirstOrDefault());
string customUserName = blobProperties.Metadata?.Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataUserName.ToUpperInvariant()) .Select(i => i.Value).FirstOrDefault();
bool hasCustomUserID = int.TryParse(blobProperties.Metadata?.Where(i => i.Key.ToUpperInvariant() == CustomBlobMetadataUserId.ToUpperInvariant()) .Select(i => i.Value).FirstOrDefault(), out int customUserID);
PdfFileMetaDataResponse response = new PdfFileMetaDataResponse(){
FileName = file.Name,
CustomUserName = customUserName,
CustomUserId = hasCustomUserID ? customUserID : (int?)null,
LastModified = customLastModifiedDate ?? file.Properties.LastModified?.UtcDateTime
};
responseList.Add(response);
}
return responseList;
}
Thank you.
When you call ContainerClient.GetBlobsAsync, the properties of each blob are returned as part of the listing, so you do not need to call GetPropertiesAsync for each blob. Removing that call should improve the speed, because it costs one extra network request per blob.
I also noticed that your code reads each blob's metadata (which is probably why you are fetching the properties of each blob again). However, you can get the metadata of the blob as part of the blob listing operation. You just need to include BlobTraits.Metadata when listing blobs.
So your blob listing code would be something like:
await foreach (BlobItem file in containerClient.GetBlobsAsync(traits: BlobTraits.Metadata, cancellationToken: cancellationToken))
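To show how the loop body might change once the per-blob GetPropertiesAsync call is gone, here is a minimal sketch. It assumes the Azure.Storage.Blobs v12 package. The metadata key names (`username`, `userid`) are hypothetical placeholders for the question's constants, and the response class is a simplified stand-in. The custom last-modified conversion from the question is left out because its helper is not shown; it would read its value from the same metadata lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Simplified stand-in for the question's PdfFileMetaDataResponse.
public sealed class PdfFileMetaDataResponse
{
    public string FileName { get; set; }
    public string CustomUserName { get; set; }
    public int? CustomUserId { get; set; }
    public DateTime? LastModified { get; set; }
}

public static class BlobListingSketch
{
    // Hypothetical key names; substitute the real metadata constants.
    private const string UserNameKey = "username";
    private const string UserIdKey = "userid";

    // Pure mapping step: builds a response from data already present in the listing.
    public static PdfFileMetaDataResponse ToResponse(
        string name, IDictionary<string, string> metadata, DateTimeOffset? lastModified)
    {
        // Copy into a case-insensitive dictionary once instead of scanning
        // the metadata with ToUpperInvariant() for every key.
        var lookup = metadata == null
            ? new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
            : new Dictionary<string, string>(metadata, StringComparer.OrdinalIgnoreCase);

        lookup.TryGetValue(UserNameKey, out string userName);
        lookup.TryGetValue(UserIdKey, out string rawUserId);
        bool hasUserId = int.TryParse(rawUserId, out int userId);

        return new PdfFileMetaDataResponse
        {
            FileName = name,
            CustomUserName = userName,
            CustomUserId = hasUserId ? userId : (int?)null,
            LastModified = lastModified?.UtcDateTime
        };
    }

    // Listing step: metadata arrives with each BlobItem, so no
    // GetPropertiesAsync call is made inside the loop.
    public static async Task<IList<PdfFileMetaDataResponse>> GetAllFilesMetaDataAsync(
        BlobContainerClient containerClient, CancellationToken cancellationToken)
    {
        var responseList = new List<PdfFileMetaDataResponse>();
        await foreach (BlobItem file in containerClient.GetBlobsAsync(
            traits: BlobTraits.Metadata, cancellationToken: cancellationToken))
        {
            responseList.Add(ToResponse(file.Name, file.Metadata, file.Properties.LastModified));
        }
        return responseList;
    }
}
```

With this shape, the number of requests depends on the number of listing pages rather than the number of blobs. Passing in an existing BlobContainerClient, instead of building a new BlobServiceClient on every call as the question's code does, is a design choice of this sketch and not part of the original answer.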