[英]Azure data factory copy activity performance tuning
https://docs.microsoft.com/en-us/azure/data-factory/data-factory-load-sql-data-warehouse 。 根據1000 DWU和polybase的鏈接,我應該獲得200MBps的吞吐量。 但我得到4.66 MBps。 我在xlargerc資源類中添加了用戶,以便從azure sql datawarehouse獲得最佳吞吐量。
以下是Pipeline JSON。
{
"name": "UCBPipeline-Copy",
"properties": {
"description": "pipeline with copy activity",
"activities": [
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "BlobSource"
},
"sink": {
"type": "SqlDWSink",
"allowPolyBase": true,
"writeBatchSize": 0,
"writeBatchTimeout": "00:00:00"
},
"cloudDataMovementUnits": 4
},
"inputs": [
{
"name": "USBBlob_Concept
}
],
"outputs": [
{
"name": "AzureDW_Concept"
}
],
"policy": {
"timeout": "01:00:00",
"concurrency": 1
},
"scheduler": {
"frequency": "Day",
"interval": 1
},
"name": "AzureBlobtoSQLDW_Concept",
"description": "Copy Activity"
}
],
"start": "2017-02-28T18:00:00Z",
"end": "2017-03-01T19:00:00Z",
"isPaused": false,
"hubName": "sampledf1_hub",
"pipelineMode": "Scheduled"
}
}
輸入數據集:
{
"name": "AzureBlob_Concept",
"properties": {
"published": false,
"type": "AzureBlob",
"linkedServiceName": "AzureZRSStorageLinkedService",
"typeProperties": {
"fileName": "conceptTab.txt",
"folderPath": "source/",
"format": {
"type": "TextFormat",
"columnDelimiter": "\t"
}
},
"availability": {
"frequency": "Day",
"interval": 1
},
"external": true,
"policy": {}
}
}
輸出數據集:
{
"name": "AzureDW_Concept",
"properties": {
"published": false,
"type": "AzureSqlDWTable",
"linkedServiceName": "AzureSqlDWLinkedService",
"typeProperties": {
"tableName": "concept"
},
"availability": {
"frequency": "Day",
"interval": 1
}
}
}
配置中缺少什么?
我看了一下runId“e98ac557-a507-4a6e-8833-978eff1723c3”,它應該屬於你的復制活動。 從我們的服務日志來看,源文件不夠大(在您的情況下為270 MB),因此服務調用延遲會使吞吐量不夠好。 您可以嘗試加載更大的文件以獲得更好的吞吐量。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.