
Apache Beam / Google Dataflow - Error handling

I have a pipeline with quite a few steps (just over 15). I want to report a failure every time a DoFn fails. I started implementing this with TupleTags, using code such as:

try {
 ... do stuff ...
 c.output(successTag, ...);
} catch (Exception e) {
 c.output(failureTag, new Failure(...));
}

But since my pipeline contains a lot of steps, this makes the pipeline definition code quite hard to read and maintain.

Is there a more global way to achieve this? Something like raising a custom exception that is handled globally at the pipeline level?

What you are doing is the correct approach to catch errors and route them to a separate output. You will need this on each step, though. To avoid repeating it, you could use a standard Java pattern: create a base class for all your ParDos and put the exception handling code in its processElement. Then implement the actual logic of each step in a separate function (e.g. processElementImpl), which processElement calls.
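The base-class idea can be sketched roughly as follows. To keep the example runnable without the Beam SDK, this is plain Java: `Outputs` is a hypothetical stand-in for the two tagged outputs (successTag / failureTag), and the fields of `Failure` are assumptions, since the question only shows `new Failure(...)`. In a real pipeline the base class would presumably extend Beam's `DoFn`, mark `processElement` with `@ProcessElement`, and emit to the TupleTags instead of appending to lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical failure record; the fields are assumptions, not from the original post.
final class Failure {
    final String step;
    final Object element;
    final Exception error;

    Failure(String step, Object element, Exception error) {
        this.step = step;
        this.element = element;
        this.error = error;
    }
}

// Stand-in for the success and failure outputs of one step.
final class Outputs<O> {
    final List<O> successes = new ArrayList<>();
    final List<Failure> failures = new ArrayList<>();
}

// Base class that holds the try/catch once, so individual steps do not repeat it.
abstract class SafeFn<I, O> {
    private final String stepName;

    protected SafeFn(String stepName) {
        this.stepName = stepName;
    }

    // Each step implements only its own logic here and is free to throw.
    protected abstract O processElementImpl(I element) throws Exception;

    // Final, so subclasses cannot bypass the shared error handling.
    public final void processElement(I element, Outputs<O> out) {
        try {
            out.successes.add(processElementImpl(element));
        } catch (Exception e) {
            out.failures.add(new Failure(stepName, element, e));
        }
    }
}

// Example step: contains no error handling code of its own.
final class ParseIntFn extends SafeFn<String, Integer> {
    ParseIntFn() {
        super("ParseInt");
    }

    @Override
    protected Integer processElementImpl(String element) {
        return Integer.parseInt(element.trim());
    }
}

class ErrorHandlingSketch {
    public static void main(String[] args) {
        Outputs<Integer> out = new Outputs<>();
        SafeFn<String, Integer> fn = new ParseIntFn();
        for (String s : new String[] {"1", " 2 ", "not a number"}) {
            fn.processElement(s, out);
        }
        System.out.println("successes: " + out.successes);
        for (Failure f : out.failures) {
            System.out.println("failure in " + f.step + " for element " + f.element);
        }
    }
}
```

This sketch simplifies one thing: each element produces exactly one output, whereas a DoFn may emit zero or many. The pipeline definition still has to wire the failure output of every step somewhere, so the pattern removes the repeated try/catch but not the per-step plumbing.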
