
Azure Blob storage very slow read time

I am saving JSON data as block blobs in Azure Blob Storage (Standard tier). The file is 14.5 MB and contains about 25,000 objects of OHLC data. I access the blob from an Azure Function located in the same region. The code simply reads the blob for deserialization, but it takes 20-40 seconds. Is there something I missed?

    public static async Task<Stream> GetBlob(string ConnectionString, string ContainerName, string Path)
    {
        BlobClient blobClient = new BlobClient(ConnectionString, ContainerName, Path);
        MemoryStream ms = new MemoryStream();

        try
        {
            await blobClient.DownloadToAsync(ms);
            ms.Seek(0, SeekOrigin.Begin);
            return ms;
        }
        catch
        {
            // Release the buffer if the download fails, then rethrow.
            ms.Dispose();
            throw;
        }
    }

And I request the blob in the function:

        log.LogInformation($"Begin Downloading Blob ");
        using (Stream blob = await Core.Azure.Blob.GetBlob(blobString, "containerName", fileName))
        {
            log.LogInformation($"End Downloading Blob ");
            log.LogInformation($"Begin Reading Blob ");
            using (StreamReader reader = new StreamReader(blob))
            {
                string json = await reader.ReadToEndAsync();
                log.LogInformation($"Begin Deserialize Blob ");
                sticks = JsonConvert.DeserializeObject<List<MyModel>>(json);
                log.LogInformation($"End Deserialize Blob ");
            }
        }
        log.LogInformation($"{symbol} End Get Blob ");

The function that checks whether the blob exists:

    public static async Task<bool> CheckExists(string ConnectionString, string ContainerName, string Path)
    {
        BlobClient blobClient = new BlobClient(ConnectionString, ContainerName, Path);
        return await blobClient.ExistsAsync();
    }

The timing results show that it takes up to 47 seconds.

I switched to a stream reader and a JSON reader, and it dropped to 10-30 seconds, but that is still a very long time.

I have added the timing here:

2021-01-09 23:53:26.656 Begin Downloading Blob
2021-01-09 23:53:30.163 End Downloading Blob
2021-01-09 23:53:30.163 Begin Reading Blob
2021-01-09 23:53:37.040 Begin Deserialize Blob
2021-01-09 23:53:49.737 End Deserialize Blob

Another run:
OHLCData.Json, 14.44 MB (28,000 rows)

2021-01-10 12:40:49.970 Begin Check Blob Exists
2021-01-10 12:40:58.962 End Check Blob Exists
2021-01-10 12:40:58.962 Begin Downloading Blob
2021-01-10 12:41:08.181 End Downloading Blob
2021-01-10 12:41:08.187 Begin Reading Blob
2021-01-10 12:41:25.713 Begin Deserialize Blob
2021-01-10 12:41:33.817 End Deserialize Blob
2021-01-10 12:41:33.817 End Get Blob
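
To see where the time goes, the log lines of the second run can be turned into per-phase durations. The following is only a sketch of that arithmetic: the class and method names (`PhaseTimings`, `Parse`, `Elapsed`) and the phase labels are made up for illustration, and the timestamps are copied from the log above.

```csharp
using System;
using System.Globalization;

public static class PhaseTimings
{
    // Parses a timestamp in the format used by the log lines above.
    public static DateTime Parse(string stamp) =>
        DateTime.ParseExact(stamp, "yyyy-MM-dd HH:mm:ss.fff", CultureInfo.InvariantCulture);

    // Elapsed time between two log timestamps.
    public static TimeSpan Elapsed(string begin, string end) => Parse(end) - Parse(begin);

    public static void Main()
    {
        // Begin/end pairs copied from the second run; the labels are illustrative.
        var phases = new (string Name, string Begin, string End)[]
        {
            ("Check exists",   "2021-01-10 12:40:49.970", "2021-01-10 12:40:58.962"),
            ("Download",       "2021-01-10 12:40:58.962", "2021-01-10 12:41:08.181"),
            ("Read to string", "2021-01-10 12:41:08.187", "2021-01-10 12:41:25.713"),
            ("Deserialize",    "2021-01-10 12:41:25.713", "2021-01-10 12:41:33.817"),
        };

        foreach (var p in phases)
        {
            Console.WriteLine($"{p.Name,-15} {Elapsed(p.Begin, p.End).TotalSeconds:F3} s");
        }
    }
}
```

For this run the phases come to roughly 9.0 s (exists check), 9.2 s (download), 17.5 s (reading the stream into a string) and 8.1 s (deserialization), so the download itself accounts for only about a fifth of the total of nearly 44 seconds.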

You are downloading the whole blob into a memory stream (an unnecessary extra memory cost), converting it to a string, and then deserializing it. I would rather deserialize directly from the blob stream in one pass, using the stream support in Newtonsoft.Json as shown below, to optimize speed and memory use.

    BlobClient blobClient = new BlobClient(ConnectionString, ContainerName, Path);
    using (var stream = await blobClient.OpenReadAsync())
    using (var sr = new StreamReader(stream))
    using (var jr = new JsonTextReader(sr))
    {
        result = JsonSerializer.CreateDefault().Deserialize<T>(jr);
    }

You can do something similar with the System.Text.Json APIs.

    JsonSerializerOptions Options = new JsonSerializerOptions();
    using (var stream = await blobClient.OpenReadAsync())
    {
        result = await JsonSerializer.DeserializeAsync<T>(stream, Options);
    }
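
For trying the streaming approach without a storage account, here is a minimal self-contained sketch of the System.Text.Json variant. It makes several assumptions: `Candle` is a hypothetical stand-in for the question's `MyModel` (which is not shown), `StreamingJson.ReadListAsync` is an illustrative helper name, and a `MemoryStream` over a tiny payload takes the place of the stream returned by `blobClient.OpenReadAsync()`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical model; the real MyModel from the question is not shown.
public sealed class Candle
{
    public decimal Open { get; set; }
    public decimal High { get; set; }
    public decimal Low { get; set; }
    public decimal Close { get; set; }
}

public static class StreamingJson
{
    // Deserializes a JSON array directly from a readable stream,
    // without first copying the whole payload into a string.
    public static async Task<List<T>> ReadListAsync<T>(Stream stream)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        var items = await JsonSerializer.DeserializeAsync<List<T>>(stream, options);
        return items ?? new List<T>();
    }

    public static async Task Main()
    {
        // Stand-in for the blob stream; in the function this would come from OpenReadAsync().
        byte[] payload = Encoding.UTF8.GetBytes(
            "[{\"open\":1.0,\"high\":2.5,\"low\":0.5,\"close\":2.0}]");
        using var stream = new MemoryStream(payload);

        List<Candle> candles = await ReadListAsync<Candle>(stream);
        Console.WriteLine($"{candles.Count} candle(s), first close = {candles[0].Close}");
    }
}
```

Because the helper only depends on `Stream`, the same call should work unchanged when the stream comes from the blob client instead of memory.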
