Retry of batch operation in Azure Table
I am trying to insert data from Azure Data Lake Store into an Azure Table through Azure Data Factory. The data in the Azure Data Lake file has the same schema as the final Azure Table sink.
The ADF pipeline consists of a single copy activity that copies from Azure Data Lake Store to Azure Table. The pipeline sometimes fails due to throttling, and I cannot afford to rerun the complete pipeline because it takes hours.
I want to retry only the failed batch, but I don't see that as an option provided for Azure Table.
I found SinkRetryCount and SinkRetryWait as two parameters of the AzureTableSink class, but I guess (since the documentation doesn't say clearly) that they apply to the complete pipeline.
I have two questions:
Have you tried the following?
If your process ensures a clean state as its very first step, similar to the undo in the Command design pattern (but more naive), then it can safely be re-executed.
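As a rough illustration of the clean-state idea, here is a minimal Python sketch. It is not ADF or Azure Table code: the table is modelled as a plain dictionary, and `reset_target`, `load`, and the `run_id` tag are hypothetical names introduced only for this example.

```python
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]          # (partition key, row key)
Row = Tuple[str, str, dict]    # (partition key, row key, properties)


def reset_target(table: Dict[Key, dict], run_id: str) -> None:
    """Naive undo: remove everything an earlier attempt of this run wrote."""
    stale = [key for key, props in table.items() if props.get("run_id") == run_id]
    for key in stale:
        del table[key]


def load(table: Dict[Key, dict], rows: Iterable[Row], run_id: str) -> None:
    """Clean first, then write, so the whole load can be re-executed safely."""
    reset_target(table, run_id)
    for pk, rk, props in rows:
        table[(pk, rk)] = {**props, "run_id": run_id}
```

Because the cleanup runs first, calling `load` again after a partial failure ends in the same state as a single successful run. The cost is that the whole load is repeated, which may not suit a copy that takes hours.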
Reference: https://docs.microsoft.com/en-us/azure/data-factory/v1/data-factory-create-pipelines
If you are on ADFv2, you have more options and can build more complex logic to handle errors:
For the activity that is failing, wrap it in an until-success loop, and be sure to include a bound on the number of executions.
You can add more activities in the loop to handle the failure and to log, notify, or resolve known failure conditions caused by externalities outside your control.
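The control flow of a bounded until-success loop can be sketched in Python as follows. This shows only the logic, not an ADF pipeline definition: `ThrottledError`, `run_until_success`, and the exponential backoff between attempts are assumptions made for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling failure raised by the activity."""


def run_until_success(
    activity: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    on_failure: Optional[Callable[[int, Exception], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `activity` until it succeeds, but never more than `max_attempts` times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return activity()
        except ThrottledError as exc:
            # Hook for the extra "log, notify, or resolve" activities.
            if on_failure is not None:
                on_failure(attempt, exc)
            if attempt == max_attempts:
                raise  # the bound: give up instead of looping forever
            # Back off before the next attempt: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. The backoff is a design choice for throttling scenarios and is not something the loop itself requires.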
You can also use asynchronous communication with future process executions by saving success to a central store. Later executions then check whether that work already succeeded and, if so, stop processing before the activity.
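One way to picture the success-marker idea is the Python sketch below. The central store is modelled as an in-memory set, and `run_batches` and `copy_batch` are hypothetical names; in practice the store would be something durable such as a table or a file.

```python
from typing import Callable, Dict, List, Set


def run_batches(
    batches: Dict[str, List[dict]],
    copy_batch: Callable[[List[dict]], None],
    completed: Set[str],
) -> None:
    """Copy each batch once; batches already marked as completed are skipped."""
    for batch_id, rows in batches.items():
        if batch_id in completed:
            continue  # an earlier execution already succeeded for this batch
        copy_batch(rows)         # may raise, e.g. when the sink throttles
        completed.add(batch_id)  # record success only after the copy returns
```

If `copy_batch` raises partway through, the batches finished so far remain recorded, so a later execution with the same `completed` store resumes at the batch that failed instead of starting over.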
Check retries; see Retry at https://docs.microsoft.com/en-us/azure/data-factory/data-factory-create-pipelines .
Retry: Number of retries before the data processing for the slice is marked as Failure. Activity execution for a data slice is retried up to the specified retry count. The retry is done as soon as possible after the failure.
Hope it helps.