Beam/Dataflow 2.2.0 - extract first n elements from PCollection
Is there any way to extract the first n elements of a Beam PCollection? The documentation doesn't seem to indicate any such function. I think such an operation would first require a global element number assignment and then a filter. It would be nice to have this functionality.
I use the Google Dataflow Java SDK 2.2.0.
PCollections are unordered per se, so the notion of "first N elements" does not exist. However:
In case you need the top N elements by some criterion, you can use the Top transform.
In case you need any N elements, you can use the Sample transform.
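The two options above can be sketched roughly as follows with the Beam Java SDK. This is a minimal illustration, not a definitive implementation: the class name `FirstNExample`, the `ByLength` comparator, and the sample input are hypothetical, and the sketch assumes the Beam core SDK and a runner (for example the direct runner) are on the classpath.

```java
import java.io.Serializable;
import java.util.Comparator;
import java.util.List;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Sample;
import org.apache.beam.sdk.transforms.Top;
import org.apache.beam.sdk.values.PCollection;

public class FirstNExample {

  // Hypothetical criterion: order strings by length. Top.of requires a
  // comparator that is also Serializable.
  public static class ByLength implements Comparator<String>, Serializable {
    @Override
    public int compare(String a, String b) {
      return Integer.compare(a.length(), b.length());
    }
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Hypothetical input; in a real pipeline this would come from a source.
    PCollection<String> words =
        p.apply(Create.of("a", "bb", "ccc", "dddd", "eeeee"));

    // Top N by a criterion: yields a PCollection with a single element,
    // a List of the 3 largest strings under ByLength, in decreasing order.
    PCollection<List<String>> longest = words.apply(Top.of(3, new ByLength()));

    // Any N elements: yields up to 3 elements of the input, with no
    // guarantee about which ones are chosen.
    PCollection<String> anyThree = words.apply(Sample.<String>any(3));

    p.run().waitUntilFinish();
  }
}
```

Note the different output shapes: Top produces one List inside a PCollection, while Sample.any produces a PCollection of individual elements. If a uniformly random sample is needed rather than arbitrary elements, Sample.fixedSizeGlobally may be a better fit.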