Create a pipeline using Data Factory with a copy activity from Azure Blob storage to Data Lake Store

I am trying to create a Data Factory pipeline with a copy activity that copies data from Azure Blob storage to Data Lake Store.

But when I run the pipeline, its status shows as failed and I get the following error:

Copy activity encountered a user error at Source side: ErrorCode=UserErrorSourceBlobNotExist,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The required Blob is missing. ContainerName: https://*********, ContainerExist: True, BlobPrefix: , BlobCount: 0.,Source=Microsoft.DataTransfer.ClientLibrary,'.

I followed the official Azure tutorial for using a Data Factory copy activity to copy from Azure Blob storage to Azure Data Lake Store, and it works correctly on my side. You can create the pipeline with the Azure portal, Visual Studio, or PowerShell by following the tutorial step by step. The tutorial also supplies the following JSON definitions:

  • A linked service of type AzureStorage.
  {
    "name": "StorageLinkedService",
    "properties": {
      "type": "AzureStorage",
      "typeProperties": {
        "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
      }
    }
  }
  • A linked service of type AzureDataLakeStore.
  {
    "name": "AzureDataLakeStoreLinkedService",
    "properties": {
      "type": "AzureDataLakeStore",
      "typeProperties": {
        "dataLakeStoreUri": "https://<accountname>.azuredatalakestore.net/webhdfs/v1",
        "servicePrincipalId": "<service principal id>",
        "servicePrincipalKey": "<service principal key>",
        "tenant": "<tenant info, e.g. microsoft.onmicrosoft.com>",
        "subscriptionId": "<subscription of ADLS>",
        "resourceGroupName": "<resource group of ADLS>"
      }
    }
  }
  • An input dataset of type AzureBlob.
  {
    "name": "AzureBlobInput",
    "properties": {
      "type": "AzureBlob",
      "linkedServiceName": "StorageLinkedService",
      "typeProperties": {
        "folderPath": "mycontainer/myfolder/yearno={Year}/monthno={Month}/dayno={Day}",
        "partitionedBy": [
          {
            "name": "Year",
            "value": {
              "type": "DateTime",
              "date": "SliceStart",
              "format": "yyyy"
            }
          },
          {
            "name": "Month",
            "value": {
              "type": "DateTime",
              "date": "SliceStart",
              "format": "MM"
            }
          },
          {
            "name": "Day",
            "value": {
              "type": "DateTime",
              "date": "SliceStart",
              "format": "dd"
            }
          },
          {
            "name": "Hour",
            "value": {
              "type": "DateTime",
              "date": "SliceStart",
              "format": "HH"
            }
          }
        ]
      },
      "external": true,
      "availability": {
        "frequency": "Hour",
        "interval": 1
      },
      "policy": {
        "externalData": {
          "retryInterval": "00:01:00",
          "retryTimeout": "00:10:00",
          "maximumRetry": 3
        }
      }
    }
  }
  • An output dataset of type AzureDataLakeStore.
  {
    "name": "AzureDataLakeStoreOutput",
    "properties": {
      "type": "AzureDataLakeStore",
      "linkedServiceName": "AzureDataLakeStoreLinkedService",
      "typeProperties": {
        "folderPath": "datalake/output/"
      },
      "availability": {
        "frequency": "Hour",
        "interval": 1
      }
    }
  }
  • A pipeline with a copy activity that uses BlobSource and AzureDataLakeStoreSink.
  {
    "properties": {
      "description": "pipeline with copy activity",
      "activities": [
        {
          "name": "AzureBlobtoDataLake",
          "description": "Copy Activity",
          "type": "Copy",
          "inputs": [
            { "name": "AzureBlobInput" }
          ],
          "outputs": [
            { "name": "AzureDataLakeStoreOutput" }
          ],
          "typeProperties": {
            "source": {
              "type": "BlobSource"
            },
            "sink": {
              "type": "AzureDataLakeStoreSink"
            }
          },
          "scheduler": {
            "frequency": "Hour",
            "interval": 1
          },
          "policy": {
            "concurrency": 1,
            "executionPriorityOrder": "OldestFirst",
            "retry": 0,
            "timeout": "01:00:00"
          }
        }
      ]
    }
  }
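
The error in the question reports BlobCount: 0, so one thing worth checking is whether any blob actually exists under the folder the input dataset points to for the slice being run. The following is a minimal Python sketch, not part of the tutorial, that shows which container and prefix the AzureBlobInput folderPath would expand to for each hourly slice. The format mapping, the helper names resolve_folder_path and hourly_slices, and the start and end times are assumptions made for illustration; the substitution only approximates how Data Factory fills in the partitionedBy variables.

```python
from datetime import datetime, timedelta

# .NET-style format strings used in the dataset, mapped to strftime codes.
# Assumption: only the four formats that appear in the sample are covered.
DOTNET_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H"}

# Same template and partition variables as the AzureBlobInput dataset.
FOLDER_PATH = "mycontainer/myfolder/yearno={Year}/monthno={Month}/dayno={Day}"
PARTITIONED_BY = [
    {"name": "Year", "format": "yyyy"},
    {"name": "Month", "format": "MM"},
    {"name": "Day", "format": "dd"},
    {"name": "Hour", "format": "HH"},
]


def resolve_folder_path(template, partitioned_by, slice_start):
    """Replace each {Name} placeholder with the formatted slice start time."""
    path = template
    for part in partitioned_by:
        value = slice_start.strftime(DOTNET_TO_STRFTIME[part["format"]])
        path = path.replace("{" + part["name"] + "}", value)
    return path


def hourly_slices(start, end):
    """Yield the start time of every one-hour slice in [start, end)."""
    current = start
    while current < end:
        yield current
        current += timedelta(hours=1)


if __name__ == "__main__":
    # Hypothetical active period, chosen only for illustration.
    start = datetime(2017, 6, 1, 18)
    end = datetime(2017, 6, 1, 20)
    for slice_start in hourly_slices(start, end):
        resolved = resolve_folder_path(FOLDER_PATH, PARTITIONED_BY, slice_start)
        # The first path segment is the container; the rest is the blob prefix.
        container, _, prefix = resolved.partition("/")
        print(f"{slice_start:%Y-%m-%d %H:%M} container={container} prefix={prefix}")
```

If the storage account has no blobs under the printed container and prefix for a given slice, a copy run for that slice has nothing to read, which would be consistent with the "required Blob is missing" message.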

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
