
Implementing retractions in Google Dataflow

I read the paper "The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing". Alas, the SDK does not yet expose the accumulating & retracting triggering mode (section 2.3).

Is there a workaround for getting similar semantics?

I have been reading the source and have figured out that StateTag or StateNamespace may be the way I can store the "last emitted value of the window", which could then be used to calculate the retraction message further down the pipeline. Is this the correct path, or are there other classes or approaches I should look at?

The upcoming state API is indeed your best bet for emulating retractions. The classes you mentioned are part of the state API, but everything in the com.google.cloud.dataflow.sdk.util package is for internal use only; we technically make no guarantees that those APIs won't change drastically, or that they will be released at all. That said, releasing that API is on our roadmap, and I'm hopeful we'll get it released relatively soon.
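To make the idea concrete, here is a minimal sketch of the emulation in plain Java. It deliberately does not use the Dataflow SDK or the internal state classes: an in-memory map stands in for per-window state, and the names RetractionEmulator, Change, and onPane are hypothetical, invented for illustration only. Each time a window fires, it retracts the previously emitted value (if any) and then emits the new one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the Dataflow SDK. */
final class RetractionEmulator {

    /** A value tagged as either a normal emission or a retraction of an earlier one. */
    static final class Change {
        final String window;
        final long value;
        final boolean retraction;

        Change(String window, long value, boolean retraction) {
            this.window = window;
            this.value = value;
            this.retraction = retraction;
        }

        @Override
        public String toString() {
            return (retraction ? "-" : "+") + value + "@" + window;
        }
    }

    // Assumption: an in-memory map standing in for per-window state.
    // In a real pipeline this would have to live in the runner's state storage.
    private final Map<String, Long> lastEmitted = new HashMap<>();

    /** Called once per pane firing; returns the records to send downstream. */
    List<Change> onPane(String window, long newValue) {
        List<Change> out = new ArrayList<>();
        // put() returns the previously stored value, or null on the first firing.
        Long previous = lastEmitted.put(window, newValue);
        if (previous != null) {
            out.add(new Change(window, previous, true)); // retract the old value first
        }
        out.add(new Change(window, newValue, false));    // then emit the new value
        return out;
    }

    public static void main(String[] args) {
        RetractionEmulator emulator = new RetractionEmulator();
        System.out.println(emulator.onPane("w1", 5)); // first firing: nothing to retract
        System.out.println(emulator.onPane("w1", 8)); // second firing: retract 5, emit 8
    }
}
```

Emitting the retraction before the new value keeps downstream aggregates from briefly double-counting, although ordering between the two records is not something a distributed runner would guarantee on its own.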

One thing to keep in mind: all the code downstream of your custom retractions will need to be able to differentiate them from normal records. This is something we'll do automatically for you once bona fide retraction support is ready, but in the meantime, you'll just need to make sure all the code you write that might receive a retraction knows how to recognize and handle it as such.
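As a rough illustration of what "recognize and handle it as such" could look like, the sketch below shows a downstream running sum that checks a retraction flag on each record. This is plain Java, not Dataflow SDK code, and TaggedValue and RetractionAwareSum are hypothetical names; how you actually encode the flag (a wrapper type, a field, a side output) is up to your pipeline.

```java
/** Hypothetical sketch; these types are not part of the Dataflow SDK. */
final class RetractionAwareSum {

    /** A record carrying an explicit flag that marks it as a retraction. */
    static final class TaggedValue {
        final long value;
        final boolean retraction;

        TaggedValue(long value, boolean retraction) {
            this.value = value;
            this.retraction = retraction;
        }
    }

    private long total;

    /** Adds normal records to the sum and subtracts retracted ones. */
    void accept(TaggedValue record) {
        if (record.retraction) {
            total -= record.value; // undo the effect of the earlier emission
        } else {
            total += record.value;
        }
    }

    long total() {
        return total;
    }

    public static void main(String[] args) {
        RetractionAwareSum sum = new RetractionAwareSum();
        sum.accept(new TaggedValue(5, false)); // first pane emits 5
        sum.accept(new TaggedValue(5, true));  // later pane retracts 5 ...
        sum.accept(new TaggedValue(8, false)); // ... and emits the corrected 8
        System.out.println(sum.total());
    }
}
```

A consumer that ignored the flag would add all three values and report 18 instead of 8, which is exactly the kind of double-counting the flag exists to prevent.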
