How to process only the failed files in a ForEach activity when the master pipeline is retriggered in ADF
There are two pipelines, Master and Child. The child pipeline contains a ForEach activity that takes files as input and processes them in parallel. For instance, suppose there are 4 files: the first 2 are processed successfully and their data is loaded into a table, the 3rd file fails, and the 4th file succeeds. When I retrigger the Master pipeline, I want only the 3rd file to be processed, not all 4 files. How can this be achieved?
I have tried the following:
Moving/deleting each file once its processing is completed.
However, per the requirements, I must not move or delete the files. Could someone please assist?
I created a test and successfully achieved this. The overall idea is: use a Lookup activity to read the array of already-copied file names from the SQL table, then run a Filter against that array for each source file name. If a file name already exists in the SQL table, the Copy activity is not performed for that file. This requires adding the file name to the SQL table in the Copy activity via Additional columns.
My SQL table, [dbo].[emp], has a FileName column that records which file each row came from.
I declared 3 variables:
arr1: Array type variable that stores the source file names.
filterArray: Array type variable that stores the copied file names read from the SQL table.
arrItem: variable that stores the current file name inside the ForEach loop.
In the Lookup activity, we can use this query to get the array of copied file names from the SQL table:
select distinct FileName from [dbo].[emp]
Then assign the value to the variable filterArray.
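To make the shape of that array concrete, here is a small Python sketch that runs the same distinct query against an in-memory SQLite table. SQLite, the sample rows, the id and name columns, and the helper name lookup_copied_files are assumptions for illustration only; the real pipeline queries [dbo].[emp] through the Lookup activity, which (with "First row only" disabled) returns an array of objects keyed by column name.

```python
import sqlite3


def lookup_copied_files(conn):
    """Mimic the Lookup step: one object per distinct FileName.

    The ORDER BY is added only so the sketch prints a stable order.
    """
    rows = conn.execute(
        "select distinct FileName from emp order by FileName"
    ).fetchall()
    return [{"FileName": name} for (name,) in rows]


# Hypothetical stand-in for [dbo].[emp]; only FileName matters here.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (id integer, name text, FileName text)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [
        (1, "a", "emp.csv"),
        (2, "b", "emp.csv"),   # several rows can come from the same file
        (3, "c", "emp2.csv"),
        (4, "d", "emp4.csv"),
    ],
)

filter_array = lookup_copied_files(conn)
print(filter_array)
```

Each element is an object with a FileName property, which is why the Filter condition later reads item().FileName rather than item().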
I set the default value of the variable arr1 to the source file names: ["emp.csv","emp2.csv","emp3.csv","emp4.csv"]
In the ForEach activity, iterate over the variable arr1.
Inside the ForEach activity, assign the value @item() to the variable arrItem.
Then add a Filter activity with the following settings:
Items: @variables('filterArray')
Condition: @contains(item().FileName,variables('arrItem'))
Here item() represents each element of the filterArray array.
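As a rough illustration of what this Filter step computes, the Python sketch below applies the same test to a sample array. The function name filter_activity and the sample data are hypothetical. When the first argument of contains() is a string, the check is a substring match, which Python's in operator mirrors here; if exact matching is needed, comparing with equals would be stricter.

```python
def filter_activity(items, arr_item):
    """Keep the rows whose FileName contains arr_item (substring match)."""
    return [row for row in items if arr_item in row["FileName"]]


# Sample Lookup result: emp3.csv is missing because its copy failed.
filter_array = [
    {"FileName": "emp.csv"},
    {"FileName": "emp2.csv"},
    {"FileName": "emp4.csv"},
]

already_copied = filter_activity(filter_array, "emp2.csv")
not_copied = filter_activity(filter_array, "emp3.csv")

print(already_copied)  # one matching row
print(not_copied)      # empty list
```

An empty result means the current file has no rows in the table yet.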
In the If Condition activity, use @empty(activity('Filter1').output.Value) to determine whether this file has already been copied. The expression is true when the Filter found no matching row, that is, when the file still needs to be copied.
In the True activities (the branch that runs the copy), key in the dynamic content @item(); this represents the name of the file to be copied.
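The whole flow can be simulated in a few lines of Python to check that a rerun picks up only the failed file. This is a sketch under stated assumptions: run_pipeline and the failing parameter are hypothetical, a plain list stands in for the FileName column, the loop runs sequentially, and a failed copy is assumed to write no rows at all (a partially loaded file would be skipped on the rerun).

```python
def run_pipeline(source_files, copied_table, failing=()):
    """One pipeline run: copy only files with no rows in the table yet."""
    processed = []
    for arr_item in source_files:  # ForEach over arr1
        # Filter: rows whose FileName contains the current file name
        matches = [name for name in copied_table if arr_item in name]
        if not matches:  # If Condition: @empty(...)
            if arr_item in failing:
                continue  # simulated copy failure, nothing is recorded
            copied_table.append(arr_item)  # Copy writes the FileName column
            processed.append(arr_item)
    return processed


arr1 = ["emp.csv", "emp2.csv", "emp3.csv", "emp4.csv"]
table = []  # stands in for the FileName values in the SQL table

first_run = run_pipeline(arr1, table, failing={"emp3.csv"})
second_run = run_pipeline(arr1, table)  # retrigger after fixing emp3.csv

print(first_run)
print(second_run)
```

On the second run only emp3.csv passes the empty check, so the other three files are left alone and the source files never need to be moved or deleted.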
That's all.