
Apache Beam Combine.globally and outputting different type

I have a use case for batching multiple objects to be sent within one API request.

I am using Combine.globally() which forces me to have the same Output type as the Input type.

org.apache.beam.sdk.transforms.Combine.globally(SerializableFunction<Iterable<MyClass>, MyClass> combiner)

I would like to output a different type than my input and still batch my objects. Is there a way to do it?

As Alexey already mentioned in the comments, you have to use CombineFn. With it, the types of the input elements and the output elements can differ.
From the documentation section "Advanced combinations using CombineFn":

For more complex combine functions, you can define a subclass of CombineFn. You should use a CombineFn if the combine function requires a more sophisticated accumulator, must perform additional pre- or post-processing, might change the output type, or takes the key into account.

Example:
Declare your CombineFn as shown below:

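import java.io.Serializable;
import org.apache.beam.sdk.transforms.Combine.CombineFn;

public class CustomCombineFN extends CombineFn<MyInputClass, CustomCombineFN.Accum, MyOutputClass> {
    // The accumulator is static and Serializable so that it does not capture
    // the enclosing instance and Beam can infer a coder for it.
    public static class Accum implements Serializable {
        int sum = 0;
        int count = 0;
    }
    // Override createAccumulator(), addInput(), mergeAccumulators() and
    // extractOutput() and write your combine logic there.
    // Follow the documentation mentioned above for a better understanding.
}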

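Then, in the pipeline, apply it to your input PCollection (called input here, a PCollection<MyInputClass>):

PCollection<MyOutputClass> output = input.apply(Combine.globally(new CustomCombineFN()));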
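
To make the four-method contract more concrete, here is a minimal, dependency-free sketch of the batching use case from the question. It does not use Beam itself: SimpleCombineFn is a hypothetical stand-in that mirrors the shape of Beam's CombineFn (createAccumulator, addInput, mergeAccumulators, extractOutput), and BatchRequest, BatchingFn and CombineSketch are illustrative names that do not come from the original post. In a real pipeline you would extend Combine.CombineFn instead of implementing this interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in mirroring the shape of Beam's CombineFn; NOT the Beam class.
interface SimpleCombineFn<InputT, AccumT, OutputT> {
    AccumT createAccumulator();
    AccumT addInput(AccumT accumulator, InputT input);
    AccumT mergeAccumulators(Iterable<AccumT> accumulators);
    OutputT extractOutput(AccumT accumulator);
}

// Hypothetical output type: one API request carrying a whole batch of items.
final class BatchRequest {
    final List<String> items;
    BatchRequest(List<String> items) { this.items = items; }
}

// Input type String, accumulator List<String>, output type BatchRequest:
// the output type differs from the input type.
final class BatchingFn implements SimpleCombineFn<String, List<String>, BatchRequest> {
    public List<String> createAccumulator() { return new ArrayList<>(); }

    public List<String> addInput(List<String> acc, String input) {
        acc.add(input);
        return acc;
    }

    public List<String> mergeAccumulators(Iterable<List<String>> accs) {
        List<String> merged = new ArrayList<>();
        for (List<String> a : accs) {
            merged.addAll(a);
        }
        return merged;
    }

    public BatchRequest extractOutput(List<String> acc) { return new BatchRequest(acc); }
}

final class CombineSketch {
    // Simulates what a runner does: build one partial accumulator per partition,
    // merge the partial accumulators, then extract the single output.
    static <I, A, O> O combine(SimpleCombineFn<I, A, O> fn, List<List<I>> partitions) {
        List<A> partials = new ArrayList<>();
        for (List<I> partition : partitions) {
            A acc = fn.createAccumulator();
            for (I element : partition) {
                acc = fn.addInput(acc, element);
            }
            partials.add(acc);
        }
        return fn.extractOutput(fn.mergeAccumulators(partials));
    }

    public static void main(String[] args) {
        BatchRequest request = combine(new BatchingFn(),
                Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c")));
        System.out.println(request.items); // prints [a, b, c]
    }
}
```

The sketch processes partitions in a fixed order, so the result is deterministic here. In an actual Beam pipeline the order in which elements and partial accumulators are combined is not guaranteed, so the combine logic should not depend on it.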
