
Apache Beam/Dataflow Reshuffle deprecated, what to use instead?

Apache Beam's Reshuffle was marked as deprecated in May 2017 with the note

For internal use only; no backwards compatibility guarantees.

In addition, the DataflowRunner installs a ReshuffleOverrideFactory, and it is unclear to me how that changes the reshuffling behavior.

In any case, the JavaDoc doesn't mention what to use instead. How are users supposed to deal with ParDo transforms with high fan-out, both in general and on Dataflow?

You can look at the withFanout option on Combine.Globally (the per-key equivalent is withHotKeyFanout on Combine.PerKey). Here is the link to the Java API: https://beam.apache.org/releases/javadoc/2.0.0/org/apache/beam/sdk/transforms/Combine.Globally.html#withFanout-int-
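To illustrate what the withFanout option mentioned above is doing conceptually, here is a minimal plain-Java sketch of a two-stage combine. It deliberately does not use Beam: the class FanoutSumSketch and the method sumWithFanout are hypothetical names made up for this illustration, and round-robin sharding is an assumption chosen for simplicity, not a description of how any runner assigns intermediate keys.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Beam SDK.
final class FanoutSumSketch {

    // Sums the values in two stages: first into `fanout` intermediate
    // partial sums, then into one final total. This mirrors the idea of
    // combining through intermediate nodes before the global combine.
    static long sumWithFanout(List<Integer> values, int fanout) {
        if (fanout < 1) {
            throw new IllegalArgumentException("fanout must be >= 1");
        }
        // Stage 1: spread elements over the intermediate shards
        // (round-robin here, purely as a simplifying assumption).
        long[] partial = new long[fanout];
        for (int i = 0; i < values.size(); i++) {
            partial[i % fanout] += values.get(i);
        }
        // Stage 2: the final step only has to merge `fanout` partial
        // results instead of every input element.
        long total = 0;
        for (long p : partial) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            values.add(i);
        }
        System.out.println(sumWithFanout(values, 4));
    }
}
```

In an actual pipeline the corresponding call would look something like Sum.integersGlobally().withFanout(4) applied to a PCollection of integers. Note that this only reduces load on the final combine step; it is not a general drop-in replacement for Reshuffle.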
