
How to increase the ResponseBodySize of blob_client.download_blob() (Azure Blob Storage Python SDK)

Reviewing some Azure Log Analytics logs, I see that each time my Python Azure Function downloads a blob from Azure Storage there is an initial 32 MB GetBlob request, and all subsequent GetBlob requests are 4 MB chunks.

How can I increase this size to reduce the Function's execution time?

Example Python that downloads a blob from storage (inside the Azure Function):

import io

def load_blob_to_memory(blob_client):
    # readall() pulls the entire blob into memory as bytes
    blob_data = blob_client.download_blob().readall()
    blob_bytes = io.BytesIO(blob_data)
    return blob_bytes

Example Log Analytics query showing ResponseBodySize:

  • Query:
//==================================================//
// Assign variables
//==================================================//
let varStart = ago(2d);
let varEnd = now();
let varStorageAccount = 'stgtest';
let varIngressContainerName = 'cont-test';
let varFileName = 'test.csv';
let varSep = '/';
let varSampleUploadUri = strcat('https://', varStorageAccount, '.dfs.core.windows.net', varSep, varIngressContainerName, varSep, varFileName);
let varSampleDownloadUri = replace(@'%2F', @'/', replace(@'.dfs.', @'.blob.', tostring(varSampleUploadUri)));
//==================================================//
// Filter table
//==================================================//
StorageBlobLogs
| where TimeGenerated between (varStart .. varEnd)
  and AccountName == varStorageAccount
  //and StatusText == varStatus
  and (split(Uri, '?')[0] == varSampleUploadUri
       or split(Uri, '?')[0] == varSampleDownloadUri)
| summarize 
  count() by OperationName,
  TimeGenerated,
  UserAgent = tostring(split(UserAgentHeader, '(')[0]),
  FileName = tostring(split(tostring(parse_url(url_decode(Uri))['Path']), '/')[-1]),
  DownloadChunkSize = format_bytes(ResponseBodySize, 2, 'MB'),
  StatusCode,
  StatusText
| order by TimeGenerated asc
  • Output:
6/9/2021, 6:24:22.226 PM    GetBlob azsdk-python-storage-blob/12.8.1 Python/3.8.10  test.csv    32 MB   206 Success 1   
6/9/2021, 6:24:22.442 PM    GetBlob azsdk-python-storage-blob/12.8.1 Python/3.8.10  test.csv    4 MB    206 Success 1   
6/9/2021, 6:24:22.642 PM    GetBlob azsdk-python-storage-blob/12.8.1 Python/3.8.10  test.csv    4 MB    206 Success 1   
6/9/2021, 6:24:22.780 PM    GetBlob azsdk-python-storage-blob/12.8.1 Python/3.8.10  test.csv    4 MB    206 Success 1
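The pattern in the log matches the SDK defaults: one initial request of up to max_single_get_size (32 MB), then the remainder in max_chunk_get_size (4 MB) chunks. If the logged sizes are exact, the request count can be sanity-checked with plain arithmetic (this sketch does no SDK calls; the default values are the documented ones):

```python
import math

def expected_get_blob_requests(blob_size,
                               max_single_get_size=32 * 1024 * 1024,
                               max_chunk_get_size=4 * 1024 * 1024):
    """Rough number of GetBlob requests for a blob of blob_size bytes."""
    if blob_size <= max_single_get_size:
        return 1  # whole blob fits in the initial request
    remainder = blob_size - max_single_get_size
    return 1 + math.ceil(remainder / max_chunk_get_size)

# A ~44 MB test.csv gives the four logged requests: 32 MB + 3 x 4 MB
print(expected_get_blob_requests(44 * 1024 * 1024))  # -> 4
```

Raising max_single_get_size above the blob size would collapse this to a single request.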

The download_blob() method of the BlobClient class has a max_concurrency parameter, but I'm not sure whether using it requires a full async/await code rewrite.
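For what it's worth, max_concurrency is an ordinary keyword argument on the synchronous download_blob() call; the SDK fetches the chunk ranges in parallel threads internally, so no async/await rewrite is needed. A minimal sketch (the value 4 is an arbitrary illustration, not a recommendation):

```python
import io

def load_blob_to_memory(blob_client, max_concurrency=4):
    # max_concurrency downloads the chunk ranges in parallel threads;
    # the call itself is still synchronous -- no async/await required.
    downloader = blob_client.download_blob(max_concurrency=max_concurrency)
    return io.BytesIO(downloader.readall())
```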

Edit 1: Thanks @Guarav. This raises the defaults (single-call download to 64 MB, chunk size to 32 MB):

from azure.storage.blob import BlobClient

def create_blob_client(credentials):
    # event is the Azure Function's trigger payload carrying the blob URL
    blob_client = BlobClient.from_blob_url(
        event.get_json()["blobUrl"],
        credentials,
        max_single_get_size=64*1024*1024,  # 64 MB initial request
        max_chunk_get_size=32*1024*1024    # 32 MB per subsequent chunk
    )
    return blob_client

Please take a look at the max_single_get_size and max_chunk_get_size arguments of the BlobClient constructor. You can adjust these two to increase the amount of data downloaded in a single request.

From the documentation:

max_single_get_size

The maximum size for a blob to be downloaded in a single call; anything beyond this is downloaded in chunks (possibly in parallel). Defaults to 32*1024*1024, or 32 MB.

max_chunk_get_size

The maximum chunk size used for downloading a blob. Defaults to 4*1024*1024, or 4 MB.

