Apache Beam Combine.perKey using a compound key

I want to combine Measurements by type and device name.

public class Measurement implements Serializable {
    private String measurementType;
    private String device;
    private Long ts;
    private Double reading;
    // Getters such as getMeasurementType() and getDevice() omitted.
}

I can already compute an average per measurement type with Combine.perKey(...). What I want now is to key on a compound of device and measurementType.

Right now my KvDoFn looks like this:

import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// DoFn is already Serializable, so no extra "implements Serializable" is needed.
public class KvByMeasurementType extends DoFn<Measurement, KV<String, Measurement>> {
    @ProcessElement
    public void processElement(ProcessContext context) {
        Measurement measurement = context.element();
        // Key each measurement by its type only.
        context.output(KV.of(measurement.getMeasurementType(), measurement));
    }
}

How can I extend it to emit a compound key made of two values?

You can simply create a new object and make that the key. For example,

public class MyKey implements Serializable {
    private String measurementType;
    private String device;
}
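
A fuller version of the key class might look like the sketch below. It assumes the key should compare by value, so it adds a constructor, equals and hashCode. MyKeyDemo and countByKey are hypothetical names, used only to show that two keys with the same fields land in the same bucket. It is plain Java and does not touch Beam. As far as I know, Beam groups keys by their encoded form and expects a deterministic key coder, so a key that relies only on Java serialization may also need its own coder; treat that as something to check against your Beam version.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Sketch of a value-style compound key; field names follow the question.
final class MyKey implements Serializable {
    private final String measurementType;
    private final String device;

    MyKey(String measurementType, String device) {
        this.measurementType = measurementType;
        this.device = device;
    }

    String getMeasurementType() { return measurementType; }
    String getDevice() { return device; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof MyKey)) return false;
        MyKey other = (MyKey) o;
        return Objects.equals(measurementType, other.measurementType)
            && Objects.equals(device, other.device);
    }

    @Override
    public int hashCode() {
        return Objects.hash(measurementType, device);
    }

    @Override
    public String toString() {
        return measurementType + "/" + device;
    }
}

// Hypothetical demo class: counts rows per compound key with a HashMap.
class MyKeyDemo {
    static Map<MyKey, Integer> countByKey(String[][] typeAndDevice) {
        Map<MyKey, Integer> counts = new HashMap<>();
        for (String[] pair : typeAndDevice) {
            // Keys with equal fields collapse into one entry.
            counts.merge(new MyKey(pair[0], pair[1]), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[][] rows = {{"temp", "dev-a"}, {"temp", "dev-a"}, {"temp", "dev-b"}};
        System.out.println(countByKey(rows));
    }
}
```

If writing a coder is more than you need, a KV<String, String> holding the two fields is another option worth considering as the key.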

Then have your KvByMeasurementType DoFn build a MyKey for each element and output KV<MyKey, Measurement> pairs instead of KV<String, Measurement>.

Also define a Beam CombineFn that does the averaging; Combine.perKey then applies it separately to the values of each key. See the Beam documentation on the Combine transform for more information.
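
The averaging logic itself could be sketched as below. This is plain Java so that it runs without Beam on the classpath. The four method names mirror the ones Beam's Combine.CombineFn<InputT, AccumT, OutputT> asks for, and in a real pipeline the class would extend that type instead. AverageFn, Accum and AverageFnDemo are hypothetical names, and taking a bare Double as input is an assumption; with Measurement values you would read the reading field inside addInput.

```java
import java.io.Serializable;
import java.util.Arrays;

// Sketch of an averaging combiner with a (sum, count) accumulator.
class AverageFn implements Serializable {
    // Mutable partial result; Beam may create several per key and merge them.
    static class Accum implements Serializable {
        double sum = 0.0;
        long count = 0L;
    }

    Accum createAccumulator() {
        return new Accum();
    }

    Accum addInput(Accum accum, Double reading) {
        accum.sum += reading;
        accum.count++;
        return accum;
    }

    Accum mergeAccumulators(Iterable<Accum> accums) {
        Accum merged = createAccumulator();
        for (Accum a : accums) {
            merged.sum += a.sum;
            merged.count += a.count;
        }
        return merged;
    }

    Double extractOutput(Accum accum) {
        // No inputs means there is no meaningful average.
        return accum.count == 0 ? Double.NaN : accum.sum / accum.count;
    }
}

// Hypothetical demo: two partial accumulators merged into one average.
class AverageFnDemo {
    public static void main(String[] args) {
        AverageFn fn = new AverageFn();
        AverageFn.Accum left = fn.addInput(fn.addInput(fn.createAccumulator(), 1.0), 2.0);
        AverageFn.Accum right = fn.addInput(fn.createAccumulator(), 6.0);
        System.out.println(fn.extractOutput(fn.mergeAccumulators(Arrays.asList(left, right))));
    }
}
```

Keeping sum and count separately, rather than a running average, is what lets partial results from different workers be merged without losing accuracy.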
