
Azure Data Factory Pipeline 'On Failure'

I am setting up an ADF pipeline to copy blobs into an Azure SQL DB. I have an iteration activity in my pipeline, where I have set up a counter to loop and copy only if the blob exists.

This works great except for some random PK violations, which I will have to check manually. So I edited my pipeline to log the error and continue. I set up the pipeline as follows: if the copy activity fails due to a primary key violation, ignore it (for now), but log the details using a stored procedure and continue as usual, i.e. update the loop counter to get the next folder.

Unfortunately, the success of Log Failure does not execute the "Set Variable" activity. So the pipeline goes into an infinite loop and keeps coming back with the same exception, although the Stored Procedure activity itself logs the error message correctly.

If I create a new "Set Variable" activity that does exactly what SetLoopVariable does, it seems to be okay, but that means I have to copy every activity after it to get two separate paths, which I feel is redundant.

BACKGROUND: My file structure is container/YYYY/MM/dd/HH/mm. There will be at least one file per hour, but not one for every minute of the day, so I have to check whether the folder exists before attempting to copy.

This is by design. SetVariable will only be called if Copy Data both succeeds and fails, which can never happen, since Data Factory V2 activity dependencies are a logical AND.

Thomas's answer is correct. I had this exact issue recently. In case it helps someone else: I realised it means the arrows don't represent a flow but a dependency. The box only runs if every incoming dependency is satisfied, which is impossible in your case because it depends on the copy both succeeding and failing.
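To make the AND behaviour concrete, here is a minimal Python sketch. It is a simplified model, not the Data Factory SDK: the `PIPELINE` dict and the `evaluate` function are hypothetical, and the activity names are taken from the question. Each activity runs only if all of its listed dependencies ended with the required outcome; otherwise it is treated as skipped.

```python
# Simplified model of activity dependencies; NOT the Data Factory SDK.
# Each activity lists (upstream activity, required outcome) pairs.
PIPELINE = {
    "Copy Data": [],
    "Log Failure": [("Copy Data", "Failed")],
    "SetLoopVariable": [
        ("Copy Data", "Succeeded"),
        ("Log Failure", "Succeeded"),
    ],
}


def evaluate(pipeline, forced_outcomes):
    """Walk activities in order; one runs only if ALL its dependencies match."""
    outcomes = {}
    for name, deps in pipeline.items():
        if all(outcomes.get(up) == required for up, required in deps):
            # The activity runs; assume it succeeds unless told otherwise.
            outcomes[name] = forced_outcomes.get(name, "Succeeded")
        else:
            outcomes[name] = "Skipped"
    return outcomes


for copy_result in ("Succeeded", "Failed"):
    result = evaluate(PIPELINE, {"Copy Data": copy_result})
    print(copy_result, "->", result["SetLoopVariable"])
```

In this model SetLoopVariable ends up skipped whether the copy succeeds or fails, which matches the infinite loop described in the question.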

To solve your case, just duplicate the 'set loop variable' activity in your error-handling path.
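A rough sketch of why the duplicate works, again as a simplified Python model rather than real Data Factory code. The name `SetLoopVariableOnFailure` is made up for illustration. Each copy of the activity has a single dependency, so exactly one of them runs on each path.

```python
# Simplified model of the fixed pipeline; NOT the Data Factory SDK.
# "SetLoopVariableOnFailure" is a hypothetical name for the duplicated activity.
FIXED_PIPELINE = {
    "Copy Data": [],
    "Log Failure": [("Copy Data", "Failed")],
    "SetLoopVariable": [("Copy Data", "Succeeded")],
    "SetLoopVariableOnFailure": [("Log Failure", "Succeeded")],
}


def evaluate_fixed(pipeline, copy_outcome):
    """An activity runs only if ALL of its dependencies have the required outcome."""
    outcomes = {}
    for name, deps in pipeline.items():
        if all(outcomes.get(up) == required for up, required in deps):
            # Only Copy Data's outcome is varied; everything else succeeds if it runs.
            outcomes[name] = copy_outcome if name == "Copy Data" else "Succeeded"
        else:
            outcomes[name] = "Skipped"
    return outcomes


for copy_outcome in ("Succeeded", "Failed"):
    result = evaluate_fixed(FIXED_PIPELINE, copy_outcome)
    ran = [n for n in result if n.startswith("SetLoopVariable") and result[n] == "Succeeded"]
    print(copy_outcome, "->", ran)
```

The cost is the one the question already points out: anything that follows the loop-variable update has to hang off both copies.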

However, you might then have the same problem that I now have here: "Azure data factory: Handling inner failure in until/for activity".
