
Azure Data Factory throws "Length Required" error on copy from SQL to ADLS

I am trying to copy data from an on-premises SQL Server to Azure Data Lake Storage (ADLS) via Azure Data Factory (ADF). Everything seems to work, except that when I run (debug or trigger) the pipeline, I get the following error:

{ "errorCode": "2200", "message": "Failure happened on 'Sink' side. ErrorCode=UserErrorAdlsFileWriteFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Writing to 'AzureDataLakeStore' failed. Message: The remote server returned an error: (411) Length Required.. Response details: \\r\\nLength Required\\r\\n\\r\\n

Length Required

\\r\\n

HTTP Error 411. The request must be chunked or have a content length.

\\r\\n\\r\\n,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.WebException,Message=The remote server returned an error: (411) Length Required.,Source=System,'", "failureType": "UserError", "target": "CopyData1" }

What is really odd is that the following pipelines DO work:

  • SQL tbl1 -> SQL tbl2
  • ADLS source.txt -> ADLS sink.txt

I.e., read/write access works as expected. The latter pipeline is also able to create/overwrite the sink.txt file. A minimal sketch of the working ADLS-to-ADLS copy activity is shown below.
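
For comparison, the working ADLS source.txt -> ADLS sink.txt pipeline uses a copy activity along these lines (a sketch; the activity and dataset names here are simplified placeholders, not the exact ones from my factory):

{
    "name": "CopyAdlsToAdls1",
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "AzureDataLakeStoreSource"
        },
        "sink": {
            "type": "AzureDataLakeStoreSink"
        },
        "enableStaging": false
    },
    "inputs": [
        {
            "referenceName": "sourceTxt",
            "type": "DatasetReference"
        }
    ],
    "outputs": [
        {
            "referenceName": "sinkTxt",
            "type": "DatasetReference"
        }
    ]
}

Structurally, the only difference from the failing CopyData1 activity (see EDIT 1 below) is the source type.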

But when I run the pipeline

  • SQL tbl1 -> sink.txt

I get the Length Required error. And if sink.txt exists, the pipeline even deletes it!

I'm using ADFv2 and ADLS Gen1; ADF and ADLS reside in the same subscription/resource group, and I'm using a self-hosted / Azure Integration Runtime (for SQL and ADLS respectively). I have tested with a source statement as simple as "SELECT 1 Col". I have also tested without a dataset schema, and with schemas + mappings (sketched below).
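
The schemas + mappings test added an explicit translator to the copy activity's typeProperties, roughly like the following sketch (assuming a TabularTranslator over the single test column):

"translator": {
    "type": "TabularTranslator",
    "mappings": [
        {
            "source": {
                "name": "col1"
            },
            "sink": {
                "name": "col1"
            }
        }
    ]
}

The error was the same with and without this block.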

Is this a bug, or am I missing something? Which “Length” is required?


EDIT 1: Minimal JSON scripts

pipeline1.json

{
    "name": "pipeline1",
    "properties": {
        "activities": [
            {
                "name": "CopyData1",
                "type": "Copy",
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "typeProperties": {
                    "source": {
                        "type": "SqlSource",
                        "sqlReaderQuery": "SELECT TOP 1 'x' AS col1 FROM sys.tables"
                    },
                    "sink": {
                        "type": "AzureDataLakeStoreSink"
                    },
                    "enableStaging": false,
                    "dataIntegrationUnits": 0
                },
                "inputs": [
                    {
                        "referenceName": "table1",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "sink1",
                        "type": "DatasetReference"
                    }
                ]
            }
        ]
    }
}

table1.json

{
    "name": "table1",
    "properties": {
        "linkedServiceName": {
            "referenceName": "SqlServer1",
            "type": "LinkedServiceReference"
        },
        "type": "SqlServerTable",
        "typeProperties": {
            "tableName": "sys.tables"
        }
    }
}

sink1.json

{
    "name": "sink1",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureDataLakeStore1",
            "type": "LinkedServiceReference"
        },
        "type": "AzureDataLakeStoreFile",
        "structure": [
            {
                "name": "col1",
                "type": "String"
            }
        ],
        "typeProperties": {
            "format": {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "rowDelimiter": "",
                "nullValue": "\\N",
                "treatEmptyAsNull": true,
                "skipLineCount": 0,
                "firstRowAsHeader": true
            },
            "fileName": "sink1.txt",
            "folderPath": "myDir"
        }
    }
}

EDIT 2: Summary of conducted tests

  • SQL -> ADLS Error
  • Oracle -> ADLS Error
  • SQL -> Blob OK (sink dataset sketched below)
  • Oracle -> Blob OK
  • SQL -> SQL OK
  • ADLS -> ADLS OK
  • AzureSQLDB -> ADLS OK
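
For reference, the Blob sink dataset from the SQL -> Blob test is essentially sink1.json with an Azure Blob linked service and dataset type; a rough sketch (linked service, container and folder names are illustrative):

{
    "name": "blobSink1",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorage1",
            "type": "LinkedServiceReference"
        },
        "type": "AzureBlob",
        "typeProperties": {
            "format": {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "firstRowAsHeader": true
            },
            "fileName": "sink1.txt",
            "folderPath": "mycontainer/myDir"
        }
    }
}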

Does your self-hosted IR have a proxy setting, or does it go through a special network configuration? Such an error is typically caused by an intermediate proxy service when ADF's ADLS connector tries to talk to the ADLS service.
