
Create a pipeline using Data Factory with copy activity from Azure Blob storage to Data Lake Store

I am trying to create a pipeline using Data Factory with a copy activity from Azure Blob storage to Data Lake Store.

But when I run the pipeline, its status shows Failed with the following error:

Copy activity encountered a user error at Source side: ErrorCode=UserErrorSourceBlobNotExist,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The required Blob is missing. ContainerName: https://*********, ContainerExist: True, BlobPrefix: , BlobCount: 0.,Source=Microsoft.DataTransfer.ClientLibrary,'.

I followed the official Azure tutorial to use Data Factory with a copy activity from Azure Blob storage to Azure Data Lake Store, and it works correctly on my side. We could create the pipeline by using the Azure portal, Visual Studio, or PowerShell, following the tutorial step by step. Note that your error reports ContainerExist: True but BlobCount: 0, which means the container was reached but no blob matched the folderPath that the input dataset resolved for the scheduled slice (see the verification sketch after the snippets below). The tutorial supplies the following code.

  • A linked service of type AzureStorage.
{
  "name": "StorageLinkedService",
  "properties": {
    "type": "AzureStorage",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
    }
  }
}
  • A linked service of type AzureDataLakeStore.
{
    "name": "AzureDataLakeStoreLinkedService",
    "properties": {
        "type": "AzureDataLakeStore",
        "typeProperties": {
            "dataLakeStoreUri": "https://<accountname>.azuredatalakestore.net/webhdfs/v1",
            "servicePrincipalId": "<service principal id>",
            "servicePrincipalKey": "<service principal key>",
            "tenant": "<tenant info, e.g. microsoft.onmicrosoft.com>",
            "subscriptionId": "<subscription of ADLS>",
            "resourceGroupName": "<resource group of ADLS>"
        }
    }
}
  • An input dataset of type AzureBlob.
{
  "name": "AzureBlobInput",
  "properties": {
    "type": "AzureBlob",
    "linkedServiceName": "StorageLinkedService",
    "typeProperties": {
      "folderPath": "mycontainer/myfolder/yearno={Year}/monthno={Month}/dayno={Day}",
      "partitionedBy": [
        {
          "name": "Year",
          "value": {
            "type": "DateTime",
            "date": "SliceStart",
            "format": "yyyy"
          }
        },
        {
          "name": "Month",
          "value": {
            "type": "DateTime",
            "date": "SliceStart",
            "format": "MM"
          }
        },
        {
          "name": "Day",
          "value": {
            "type": "DateTime",
            "date": "SliceStart",
            "format": "dd"
          }
        },
        {
          "name": "Hour",
          "value": {
            "type": "DateTime",
            "date": "SliceStart",
            "format": "HH"
          }
        }
      ]
    },
    "external": true,
    "availability": {
      "frequency": "Hour",
      "interval": 1
    },
    "policy": {
      "externalData": {
        "retryInterval": "00:01:00",
        "retryTimeout": "00:10:00",
        "maximumRetry": 3
      }
    }
  }
}
  • An output dataset of type AzureDataLakeStore.
{
    "name": "AzureDataLakeStoreOutput",
    "properties": {
        "type": "AzureDataLakeStore",
        "linkedServiceName": "AzureDataLakeStoreLinkedService",
        "typeProperties": {
            "folderPath": "datalake/output/"
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        }
    }
}
  • A pipeline with a copy activity that uses BlobSource and AzureDataLakeStoreSink.
{
    "name": "SamplePipeline",
    "properties": {
        "start": "2014-06-01T18:00:00",
        "end": "2014-06-01T19:00:00",
        "description": "pipeline with copy activity",
        "activities": [
            {
                "name": "AzureBlobtoDataLake",
                "description": "Copy Activity",
                "type": "Copy",
                "inputs": [
                    {
                        "name": "AzureBlobInput"
                    }
                ],
                "outputs": [
                    {
                        "name": "AzureDataLakeStoreOutput"
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "BlobSource"
                    },
                    "sink": {
                        "type": "AzureDataLakeStoreSink"
                    }
                },
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                },
                "policy": {
                    "concurrency": 1,
                    "executionPriorityOrder": "OldestFirst",
                    "retry": 0,
                    "timeout": "01:00:00"
                }
            }
        ]
    }
}
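
As mentioned above, BlobCount: 0 means no blob matched the folderPath that the dataset resolved for the slice. Here is a minimal Python sketch of that check, an illustration only, assuming the azure-storage-blob package; the account name, key, and paths are the same placeholders used in the snippets above, so substitute your own:

from datetime import datetime
from azure.storage.blob import BlobServiceClient  # pip install azure-storage-blob

# SliceStart of the single hourly slice in the pipeline window above
# (start 2014-06-01T18:00:00, end 2014-06-01T19:00:00).
slice_start = datetime(2014, 6, 1, 18)

# Expand the dataset's folderPath macros the way partitionedBy does:
# {Year} -> "yyyy", {Month} -> "MM", {Day} -> "dd".
folder_path = "mycontainer/myfolder/yearno={0:%Y}/monthno={0:%m}/dayno={0:%d}".format(slice_start)
container_name, _, prefix = folder_path.partition("/")

# Placeholder connection string from StorageLinkedService above.
conn_str = "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
service = BlobServiceClient.from_connection_string(conn_str)
container = service.get_container_client(container_name)

# List everything under the resolved prefix; 0 results reproduces BlobCount: 0.
blobs = list(container.list_blobs(name_starts_with=prefix))
print(len(blobs), "blob(s) under", folder_path)
for blob in blobs:
    print(blob.name)

If this prints 0 blobs for the slice window, upload the expected source data to that path or adjust the folderPath and the pipeline's start/end times. Because the input dataset is marked "external": true, Data Factory only waits for the source data to appear (per the externalData retry policy); it does not create it.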
