
How to get the Pipeline status in Dataflow with multiple ParDo using Apache Beam Java

The pipeline contains multiple ParDo functions (refer to the code below). I need to send a failure message to a Pub/Sub topic when a ParDo function in the Dataflow pipeline fails. I tried PipelineResult but was not able to get the status. Is there any centralized logic to report the status when the Dataflow pipeline fails? Kindly suggest an idea to resolve the issue.

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PubMessage {
    public static final Logger LOG = LoggerFactory.getLogger(PubMessage.class.getName());

    public static void main(String[] args) {
        PipelineOption options = PipelineOptionsFactory.fromArgs(args).as(PipelineOption.class);

        Pipeline pipeline = Pipeline.create(options);

        PCollection<String> input = pipeline.apply("Read Dummy File", new ReadDummyFile(options.getDummyFilepath()));

        // Publish first Pub/Sub message to the topic
        input.apply("Pardo", ParDo.of(new msg(options.getPath())))
             .apply("Publish Pubsub Message", PubsubIO.writeMessages().to(options.getTopic()));

        // Publish second Pub/Sub message to the topic
        input.apply("Pardo Second", ParDo.of(new msgSecond("Read text")))
             .apply("Publish Second Pubsub Message", PubsubIO.writeMessages().to(options.getTopic()));

        PipelineResult p = pipeline.run();
        if (PipelineResult.State.FAILED.equals(p.waitUntilFinish())) {
            // send pubsub msg
            throw new RuntimeException("Pipeline failed for unknown reason");
        }
    }
}

msg

public class msg extends DoFn<String, PubsubMessage> {
    @ProcessElement
    public void processElement(ProcessContext c) {
        // get the value and send the message to the topic
        c.output(message);
    }
}
    
msgSecond
    
public class msgSecond extends DoFn<String, PubsubMessage> {
    @ProcessElement
    public void processElement(ProcessContext c) {
        // get the value and send the message to the topic
        c.output(message);
    }
}

          
    
    

A streaming pipeline will keep retrying rather than completing in a failed state. You could publish the message from within the DoFn itself.
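A minimal sketch of one way to do that, assuming a dead-letter pattern: wrap the per-element work in a try/catch inside the DoFn, route failures to an additional output, and publish that output to a separate failure topic. The MsgWithDeadLetter class, the SUCCESS/FAILURE tags, and the getFailureTopic() option are hypothetical names, not part of your original code.

import java.nio.charset.StandardCharsets;
import java.util.Collections;

import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.TupleTag;

public class MsgWithDeadLetter extends DoFn<String, PubsubMessage> {
    // main output for successfully built messages, additional output for failures
    public static final TupleTag<PubsubMessage> SUCCESS = new TupleTag<PubsubMessage>() {};
    public static final TupleTag<PubsubMessage> FAILURE = new TupleTag<PubsubMessage>() {};

    @ProcessElement
    public void processElement(ProcessContext c) {
        try {
            // build the real message here (same logic as the original msg DoFn)
            PubsubMessage message = new PubsubMessage(
                    c.element().getBytes(StandardCharsets.UTF_8), Collections.emptyMap());
            c.output(message);
        } catch (Exception e) {
            // instead of letting the bundle fail (and the streaming runner retry it forever),
            // emit a failure record to the dead-letter output
            PubsubMessage failure = new PubsubMessage(
                    ("FAILED: " + e.getMessage()).getBytes(StandardCharsets.UTF_8),
                    Collections.singletonMap("element", c.element()));
            c.output(FAILURE, failure);
        }
    }
}

The wiring in main would then look roughly like this; the explicit coder on the additional output is a precaution in case coder inference fails for it:

PCollectionTuple results = input.apply("Pardo",
        ParDo.of(new MsgWithDeadLetter())
             .withOutputTags(MsgWithDeadLetter.SUCCESS, TupleTagList.of(MsgWithDeadLetter.FAILURE)));

results.get(MsgWithDeadLetter.SUCCESS)
       .apply("Publish Pubsub Message", PubsubIO.writeMessages().to(options.getTopic()));

results.get(MsgWithDeadLetter.FAILURE)
       .setCoder(PubsubMessageWithAttributesCoder.of())
       .apply("Publish Failure Message", PubsubIO.writeMessages().to(options.getFailureTopic())); // hypothetical option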

