简体   繁体   中英

Write to a dynamic PubSub topic from a Dataflow job based on message content

I would like to route dynamically the different elements of a PCollection to different PubSub topics, based on the content of a field. The topics are not persistent but it is assumed that they exist when PubSubIO.Write() is executed at runtime. So Dataflow should only infer their names at runtime on a per-message basis.

The feature exists for BigQuery and dynamic table names : https://beam.apache.org/documentation/sdks/javadoc/2.0.0/org/apache/beam/sdk/io/gcp/bigquery/DynamicDestinations.html

Is there a way to do something similar with PubSubIO ?

Maybe not based on the message content but on an attribute ? https://beam.apache.org/documentation/sdks/javadoc/0.6.0/org/apache/beam/sdk/io/PubsubIO.PubsubMessage.html#getAttribute-java.lang.String-

Is there a way to do something similar with PubSubIO ?

There is no equivalent to DynamicDestinations for Pub/Sub.

You'll need to know all of the Pub/Sub topics ahead of time and have them defined in the pipeline. The pipeline can be partitioned based on some value or attribute of the Pub/Sub message and routed to the appropriate Pub/Sub topic. The Partition transform will examine the PubsubMessage and determine which partition the message belongs.

Reference: Partition

Maybe not based on the message content but on an attribute ?

Yes, you can access a message's attributes.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM