Difference between using a list or a PCollection
I'm building a pipeline in Apache Beam and got curious about this: what is the difference between applying a PTransform to a list and applying it to a PCollection? Is performance affected, or is it just that the PCollection is immutable? And is using a list a bad way to approach a pipeline in Apache Beam?
A PCollection is an immutable collection that can be bounded or unbounded.
The main difference from a list is that a PCollection can be unbounded, which is especially powerful when you are streaming data (from a large file, or from an unbounded source such as PubSub).