
Apache Beam / Google Dataflow - Error handling

I have a pipeline with quite a few steps (just above 15). I want to report a failure every time a DoFn fails. I started implementing it through TupleTags with code such as:

// inside a DoFn's @ProcessElement method
try {
 ... do stuff ...
 // happy path: emit the result to the main (success) output
 c.output(successTag, ...);
} catch (Exception e) {
 // error path: emit a Failure record to the side output instead of failing the bundle
 c.output(failureTag, new Failure(...));
}

But since my pipeline contains a lot of steps, this makes the pipeline definition code quite hard to read and maintain.
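For example, the wiring for a single step looks roughly like this (StepAFn, Failure and the tag names are placeholders for my own classes), and it has to be repeated for every step:

// Tags for this step; the anonymous subclasses keep the element types after erasure.
final TupleTag<String> successTag = new TupleTag<String>() {};
final TupleTag<Failure> failureTag = new TupleTag<Failure>() {};

// Run the step with a main (success) output plus a failure side output.
PCollectionTuple results = input.apply("StepA",
    ParDo.of(new StepAFn())
        .withOutputTags(successTag, TupleTagList.of(failureTag)));

// Pull the two outputs back out; the success output feeds the next step.
PCollection<String> stepOutput = results.get(successTag);
PCollection<Failure> stepFailures = results.get(failureTag);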

Is there a more global way to achieve this? Something like raising a custom exception that is handled globally at the pipeline level?

What you are doing is the correct approach to catch errors and output them separately, but you will need it on each step. To avoid repeating yourself, you can factor it out with a standard Java pattern: create a base class for all your ParDos, put the exception-handling code in its processElement, and have each step implement its actual logic in a separate method (e.g. processElementImpl) that processElement calls. See the sketch below.
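A minimal sketch of that pattern (the class name FailureAwareDoFn and the Failure(element, exception) constructor are assumptions for illustration; adapt them to your own Failure type):

import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.TupleTag;

// Base class holding the try/catch once; each step only implements processElementImpl.
public abstract class FailureAwareDoFn<InputT, OutputT> extends DoFn<InputT, OutputT> {

  private final TupleTag<OutputT> successTag;
  private final TupleTag<Failure> failureTag;

  protected FailureAwareDoFn(TupleTag<OutputT> successTag, TupleTag<Failure> failureTag) {
    this.successTag = successTag;
    this.failureTag = failureTag;
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    try {
      // Delegate the real per-step work to the subclass.
      c.output(successTag, processElementImpl(c.element()));
    } catch (Exception e) {
      // Emit a failure record instead of failing the bundle.
      // The Failure(element, exception) constructor is assumed; adjust to your class.
      c.output(failureTag, new Failure(String.valueOf(c.element()), e));
    }
  }

  // The actual per-step logic; any exception thrown here ends up on failureTag.
  protected abstract OutputT processElementImpl(InputT element) throws Exception;
}

You then subclass it once per step and wire each ParDo with withOutputTags(successTag, TupleTagList.of(failureTag)) exactly as in your snippet, so only the per-step logic differs between steps.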
