I have a BigQuery dimension table (which doesn't change much) and streaming JSON data from PubSub. I want to query this dimension table, enrich the incoming PubSub data by joining it against the table, and then write the joined stream to another BigQuery table.
As I am new to Dataflow/Beam and the concept is still not that clear to me (or at least I have difficulty starting to write the code), I have a number of questions:
ParDo.of(...).withSideInputs(PCollectionView<Map<String, String>> map)?

You need to join two PCollections.
One option is to use PeriodicImpulse and your own ParDo to create a periodically changing input for the dimension table. See here for an example (please note that the PeriodicImpulse transform was added recently). You can then combine the data in a ParDo where PCollection (1), the PubSub data, is the main input and PCollection (2), the BigQuery dimension data, is a side input (similar to the example above).
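To make the per-element logic of that join concrete, here is a minimal plain-Java sketch with no Beam dependency. It only models the idea: a snapshot of the dimension table is swapped out from time to time, and each incoming event is enriched by a map lookup. The class name DimensionJoin, the methods refresh and enrich, and the field names product_id and product_name are all hypothetical. In an actual pipeline the lookup would presumably sit inside a DoFn that reads the map with c.sideInput(view), and the refresh would be driven by the periodically changing input rather than called by hand.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Plain-Java model of a side-input join; hypothetical names, no Beam dependency. */
final class DimensionJoin {

    // Latest snapshot of the dimension table, keyed by the join key.
    // In Beam this role would be played by the PCollectionView<Map<String, String>>.
    private final AtomicReference<Map<String, String>> snapshot =
            new AtomicReference<>(Map.of());

    /** Replaces the snapshot; stands in for the periodic reload of the table. */
    void refresh(Map<String, String> latestRows) {
        snapshot.set(Map.copyOf(latestRows));
    }

    /**
     * Returns a copy of the event with the dimension value stored under outField.
     * Events without a match keep a null value there (a left join);
     * dropping them instead would give an inner join.
     */
    Map<String, String> enrich(Map<String, String> event, String keyField, String outField) {
        Map<String, String> out = new HashMap<>(event);
        String key = event.get(keyField);
        out.put(outField, key == null ? null : snapshot.get().get(key));
        return out;
    }

    public static void main(String[] args) {
        DimensionJoin join = new DimensionJoin();
        join.refresh(Map.of("p1", "Widget"));

        Map<String, String> event = Map.of("product_id", "p1", "qty", "3");
        System.out.println(join.enrich(event, "product_id", "product_name"));
    }
}
```

The snapshot is replaced as a whole rather than updated in place, so an event never sees a half-loaded table. This mirrors how a side input is read as a complete value for a given window.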
Finally, you can stream the output to BigQuery using the BigQueryIO.Write transform.