
Beam/Dataflow 2.2.0 - extract first n elements from pcollection

Is there any way to extract the first n elements of a Beam PCollection? The documentation doesn't seem to mention any such function. I think such an operation would first require assigning a global element number and then filtering on it. It would be nice to have this functionality built in.

I use the Google Dataflow Java SDK 2.2.0.

PCollections are unordered by nature, so the notion of "first N elements" does not exist. However:

  • If you need the top N elements by some criterion, you can use the Top transform.

  • If you need any N elements, you can use Sample.
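The two options above can be sketched as follows. This is only a minimal illustration, assuming the Beam Java SDK and a runner such as the direct runner are on the classpath; the class name `FirstN`, the helper method names, and the sample input are hypothetical and not from the question.

```java
import java.util.List;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Sample;
import org.apache.beam.sdk.transforms.Top;
import org.apache.beam.sdk.values.PCollection;

// Hypothetical helper class, for illustration only.
public class FirstN {

  // "Top N by some criterion": here the criterion is natural ordering.
  // The result is a PCollection holding a single List with the n largest
  // elements.
  static PCollection<List<Integer>> topN(PCollection<Integer> input, int n) {
    return input.apply(Top.<Integer>largest(n));
  }

  // "Any N": returns up to n arbitrary elements. Which elements are picked
  // is not specified, and this is not a uniform random sample.
  static PCollection<Integer> anyN(PCollection<Integer> input, long n) {
    return input.apply(Sample.<Integer>any(n));
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create();
    // Made-up sample input.
    PCollection<Integer> numbers = p.apply(Create.of(5, 1, 9, 3, 7));

    PCollection<List<Integer>> largestThree = topN(numbers, 3);
    PCollection<Integer> anyThree = anyN(numbers, 3);

    // Running the pipeline requires a runner on the classpath.
    p.run().waitUntilFinish();
  }
}
```

Note that `Top` collects its result into a single in-memory list, so it is best suited to small N. If a uniform random sample is needed instead of arbitrary elements, `Sample.fixedSizeGlobally(n)` may be a better fit.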


 