Create a pipeline using Data Factory with copy activity from Azure Blob Storage to Data Lake Store
I am trying to create a pipeline using Data Factory with a copy activity from Azure Blob Storage to Data Lake Store.
But when I run the pipeline, its status shows as failed and I get the following error:
Copy activity encountered a user error at Source side: ErrorCode=UserErrorSourceBlobNotExist,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The required Blob is missing. ContainerName: https://*********, ContainerExist: True, BlobPrefix: , BlobCount: 0.,Source=Microsoft.DataTransfer.ClientLibrary,'.
I followed the official Azure tutorials to use Data Factory with a copy activity from Azure Blob Storage to Azure Data Lake Store, and it works correctly on my side.
You can create the pipeline by using the Azure portal, Visual Studio, or PowerShell, following the tutorials step by step.
The tutorials also supply the following code.
- A linked service of type AzureStorage.
{
  "name": "StorageLinkedService",
  "properties": {
    "type": "AzureStorage",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
    }
  }
}
- A linked service of type AzureDataLakeStore.
{
  "name": "AzureDataLakeStoreLinkedService",
  "properties": {
    "type": "AzureDataLakeStore",
    "typeProperties": {
      "dataLakeStoreUri": "https://<accountname>.azuredatalakestore.net/webhdfs/v1",
      "servicePrincipalId": "<service principal id>",
      "servicePrincipalKey": "<service principal key>",
      "tenant": "<tenant info, e.g. microsoft.onmicrosoft.com>",
      "subscriptionId": "<subscription of ADLS>",
      "resourceGroupName": "<resource group of ADLS>"
    }
  }
}
- An input dataset of type AzureBlob.
{
  "name": "AzureBlobInput",
  "properties": {
    "type": "AzureBlob",
    "linkedServiceName": "StorageLinkedService",
    "typeProperties": {
      "folderPath": "mycontainer/myfolder/yearno={Year}/monthno={Month}/dayno={Day}",
      "partitionedBy": [
        {
          "name": "Year",
          "value": {
            "type": "DateTime",
            "date": "SliceStart",
            "format": "yyyy"
          }
        },
        {
          "name": "Month",
          "value": {
            "type": "DateTime",
            "date": "SliceStart",
            "format": "MM"
          }
        },
        {
          "name": "Day",
          "value": {
            "type": "DateTime",
            "date": "SliceStart",
            "format": "dd"
          }
        },
        {
          "name": "Hour",
          "value": {
            "type": "DateTime",
            "date": "SliceStart",
            "format": "HH"
          }
        }
      ]
    },
    "external": true,
    "availability": {
      "frequency": "Hour",
      "interval": 1
    },
    "policy": {
      "externalData": {
        "retryInterval": "00:01:00",
        "retryTimeout": "00:10:00",
        "maximumRetry": 3
      }
    }
  }
}
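To check which blob folder a given slice will actually read from, it can help to resolve the folderPath by hand. The following is a minimal Python sketch of the substitution described by partitionedBy in the input dataset above. It is a local illustration only, not Data Factory's own code: the function name resolve_folder_path and the format mapping are hypothetical, and the slice start time is taken from the sample pipeline's start value.

```python
from datetime import datetime

# Assumed mapping from the .NET-style format strings in the dataset JSON
# to Python strftime directives (covers only the four formats used above).
DOTNET_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H"}


def resolve_folder_path(folder_path, partitioned_by, slice_start):
    """Replace each {Name} token in folder_path with the formatted slice start."""
    resolved = folder_path
    for part in partitioned_by:
        fmt = DOTNET_TO_STRFTIME[part["value"]["format"]]
        token = "{" + part["name"] + "}"
        # Tokens that do not occur in folder_path (such as {Hour}) are left unused.
        resolved = resolved.replace(token, slice_start.strftime(fmt))
    return resolved


folder_path = "mycontainer/myfolder/yearno={Year}/monthno={Month}/dayno={Day}"
partitioned_by = [
    {"name": "Year", "value": {"type": "DateTime", "date": "SliceStart", "format": "yyyy"}},
    {"name": "Month", "value": {"type": "DateTime", "date": "SliceStart", "format": "MM"}},
    {"name": "Day", "value": {"type": "DateTime", "date": "SliceStart", "format": "dd"}},
    {"name": "Hour", "value": {"type": "DateTime", "date": "SliceStart", "format": "HH"}},
]

# Slice start assumed to be the sample pipeline's start time.
slice_start = datetime(2014, 6, 1, 18, 0, 0)
print(resolve_folder_path(folder_path, partitioned_by, slice_start))
# prints: mycontainer/myfolder/yearno=2014/monthno=06/dayno=01
```

The first path segment is the container and the rest is the blob prefix. If no blob exists under the resolved folder for a slice, the copy can plausibly fail with an error like the one in the question (ContainerExist: True, BlobCount: 0), so it is worth comparing the resolved path with what is really in the storage account.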
- An output dataset of type AzureDataLakeStore.
{
  "name": "AzureDataLakeStoreOutput",
  "properties": {
    "type": "AzureDataLakeStore",
    "linkedServiceName": "AzureDataLakeStoreLinkedService",
    "typeProperties": {
      "folderPath": "datalake/output/"
    },
    "availability": {
      "frequency": "Hour",
      "interval": 1
    }
  }
}
- A pipeline with a copy activity that uses BlobSource and AzureDataLakeStoreSink.
{
  "name": "SamplePipeline",
  "properties": {
    "start": "2014-06-01T18:00:00",
    "end": "2014-06-01T19:00:00",
    "description": "pipeline with copy activity",
    "activities": [
      {
        "name": "AzureBlobtoDataLake",
        "description": "Copy Activity",
        "type": "Copy",
        "inputs": [
          {
            "name": "AzureBlobInput"
          }
        ],
        "outputs": [
          {
            "name": "AzureDataLakeStoreOutput"
          }
        ],
        "typeProperties": {
          "source": {
            "type": "BlobSource"
          },
          "sink": {
            "type": "AzureDataLakeStoreSink"
          }
        },
        "scheduler": {
          "frequency": "Hour",
          "interval": 1
        },
        "policy": {
          "concurrency": 1,
          "executionPriorityOrder": "OldestFirst",
          "retry": 0,
          "timeout": "01:00:00"
        }
      }
    ]
  }
}
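The start, end, and scheduler values in the pipeline above determine how many hourly slices run. The following Python sketch lists them under a simplified model: it assumes slices begin exactly at the pipeline start time and that only whole intervals inside the window count. The function name hourly_slices is hypothetical, and the real service may align slice boundaries differently.

```python
from datetime import datetime, timedelta


def hourly_slices(start, end, interval_hours=1):
    """Return (slice_start, slice_end) pairs for whole intervals within [start, end]."""
    step = timedelta(hours=interval_hours)
    slices = []
    current = start
    # Keep only slices that fit completely inside the active period.
    while current + step <= end:
        slices.append((current, current + step))
        current += step
    return slices


# Values copied from the sample pipeline's "start" and "end" properties.
start = datetime.fromisoformat("2014-06-01T18:00:00")
end = datetime.fromisoformat("2014-06-01T19:00:00")

for slice_start, slice_end in hourly_slices(start, end):
    print(slice_start.isoformat(), "->", slice_end.isoformat())
# prints: 2014-06-01T18:00:00 -> 2014-06-01T19:00:00
```

Under these assumptions the sample pipeline produces a single slice, so the source folder for that one hour has to contain at least one blob for the copy to have anything to read.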