
How is the output PCollection of beam.io.ReadFromPubSub defined in Apache Beam/Dataflow?

Let's say we have a simple streaming pipeline that reads data from PubSub. I am wondering how the output of this step is defined. If we stream 10 messages, one after another, will all 10 messages be members of a single PCollection, or will there be 10 PCollections with a single element each?

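They will be emitted down the pipeline as 10 individual elements of a single, unbounded PCollection, with each element carrying one PubSub message as its content. ReadFromPubSub is a PTransform, and applying it yields one PCollection no matter how many messages arrive. See the source code of ReadFromPubSub.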

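Furthermore, the type of each element depends on the with_attributes flag. With with_attributes=False (the default), each element is the raw message payload as bytes. With with_attributes=True, each element is a PubsubMessage object that carries both the payload (data) and the attributes published with the message.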
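As a rough sketch of how a downstream step might handle either element shape, the snippet below normalizes one element at a time. The names to_record, build_pipeline and FakeMessage are hypothetical and not part of Beam; FakeMessage is only a local stand-in that mimics the data and attributes fields of PubsubMessage, so the helper can be tried without a Pub/Sub topic. The topic path passed to build_pipeline is something you would need to supply yourself.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMessage:
    """Hypothetical stand-in mimicking PubsubMessage's data/attributes fields."""
    data: bytes
    attributes: dict = field(default_factory=dict)


def to_record(element):
    """Normalize one element read from Pub/Sub into a plain dict.

    Accepts raw bytes (with_attributes=False) or an object exposing
    .data and .attributes (with_attributes=True).
    """
    if isinstance(element, bytes):
        return {"payload": element.decode("utf-8"), "attributes": {}}
    return {
        "payload": element.data.decode("utf-8"),
        "attributes": dict(element.attributes or {}),
    }


def build_pipeline(topic, with_attributes=False):
    """Wire up a streaming pipeline; the caller decides when to run it."""
    # Imported lazily so to_record can be tested without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    pipeline = beam.Pipeline(options=PipelineOptions(streaming=True))
    # 'messages' is the PCollection produced by the read step; each
    # incoming Pub/Sub message becomes one element of it.
    messages = pipeline | "Read" >> beam.io.ReadFromPubSub(
        topic=topic, with_attributes=with_attributes
    )
    # beam.Map calls to_record once per element.
    messages | "Normalize" >> beam.Map(to_record)
    return pipeline
```

Because beam.Map invokes the function once per element, to_record never sees a batch of messages, only a single payload or a single message object at a time.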
