
Joins on Avro format data using lambda in kStreams

I have two streams:

Stream 1:
[KSTREAM-MAP-0000000004]: 1, {"id": 1, "name": "john", "age": 26}
[KSTREAM-MAP-0000000004]: 2, {"id": 2, "name": "jane", "age": 24}
[KSTREAM-MAP-0000000004]: 3, {"id": 3, "name": "julia", "age": 25}
[KSTREAM-MAP-0000000004]: 4, {"id": 4, "name": "jamie", "age": 22}
[KSTREAM-MAP-0000000004]: 5, {"id": 5, "name": "jenny", "age": 27}

Stream 2:
[KSTREAM-MAP-0000000004]: 1, {"id": 1, "name": "xxx", "age": 26}
[KSTREAM-MAP-0000000004]: 2, {"id": 2, "name": "yyy", "age": 24}
[KSTREAM-MAP-0000000004]: 31, {"id": 3, "name": "zzz", "age": 25}
[KSTREAM-MAP-0000000004]: 41, {"id": 4, "name": "uuu", "age": 22}
[KSTREAM-MAP-0000000004]: 51, {"id": 5, "name": "iii", "age": 27}

Now I want to join the two streams and retrieve the stream 1 records whose keys are not present in stream 2.

My expected output should look like:

3, {"id": 3, "name": "julia", "age": 25}
4, {"id": 4, "name": "jamie", "age": 22}
5, {"id": 5, "name": "jenny", "age": 27}
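Put differently, the desired result is a left anti-join on the key: keep every stream 1 record whose key has no match in stream 2. The following is a minimal plain-Java sketch of that semantics only, using in-memory maps instead of Kafka Streams; the class and method names are hypothetical, and the values are reduced to the name field for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AntiJoinSketch {

    // Returns the entries of `left` whose key does not appear in `right`,
    // preserving the iteration order of `left`.
    static <K, V> Map<K, V> leftAntiJoin(Map<K, V> left, Map<K, ?> right) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : left.entrySet()) {
            if (!right.containsKey(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Keys and names taken from the sample data above.
        Map<Integer, String> stream1 = new LinkedHashMap<>();
        stream1.put(1, "john");
        stream1.put(2, "jane");
        stream1.put(3, "julia");
        stream1.put(4, "jamie");
        stream1.put(5, "jenny");

        Map<Integer, String> stream2 = new LinkedHashMap<>();
        stream2.put(1, "xxx");
        stream2.put(2, "yyy");
        stream2.put(31, "zzz");
        stream2.put(41, "uuu");
        stream2.put(51, "iii");

        // Prints {3=julia, 4=jamie, 5=jenny}
        System.out.println(leftAntiJoin(stream1, stream2));
    }
}
```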

My Avro schema file:

{"namespace": "schema.avro",
 "type": "record",
 "name": "mysql",
 "fields": [
     {"name": "id", "type": "int", "doc" : "id"},
     {"name": "name", "type": "string", "doc" : "name"},
     {"name": "age", "type": "int", "doc" : "age"}
 ]
}

I tried joining in this way:

final Serde<GenericRecord> genericAvroSerde = new GenericAvroSerde();

KStream<Integer,String> joined1 = psql_data.leftJoin(mysql_data,
    (leftValue, rightValue) ->  "psql_data=" + leftValue + ", mysql_data=" + rightValue,
    JoinWindows.of(TimeUnit.MINUTES.toMillis(1)),
    Joined.with(
      Serdes.Integer(),
      genericAvroSerde,
      genericAvroSerde)
);

But I'm getting a compilation error:

[ERROR] /home/kafka-connect/confluent-4.1.0/kafka_streaming/src/main/java/com/aail/kafka_stream.java:[140,43] error: no suitable method found for leftJoin(KStream<Integer,mysql>,(leftValue[...]Value,JoinWindows,Joined<Integer,GenericRecord,GenericRecord>)
[ERROR] method KStream.<VO#1,VR#1>leftJoin(KStream<Integer,VO#1>,ValueJoiner<? super mysql,? super VO#1,? extends VR#1>,JoinWindows) is not applicable
[ERROR] (cannot infer type-variable(s) VO#1,VR#1
[ERROR] (actual and formal argument lists differ in length))
[ERROR] method KStream.<VO#2,VR#2>leftJoin(KStream<Integer,VO#2>,ValueJoiner<? super mysql,? super VO#2,? extends VR#2>,JoinWindows,Joined<Integer,mysql,VO#2>) is not applicable
[ERROR] (inferred type does not conform to equality constraint(s)
[ERROR] inferred: GenericRecord
[ERROR] equality constraints(s): GenericRecord,mysql)

I think I need to supply my mysql Avro class for the left and right values in the Joined function instead of genericAvroSerde. I tried that but could not get it to work. Can someone please help me perform the join?

You need to configure the GenericAvroSerde before you use it:

final Serde<GenericRecord> genericAvroSerde = new GenericAvroSerde();
genericAvroSerde.configure(...);

Pass in configs so that it can find the Confluent Schema Registry, as described in the docs: https://docs.confluent.io/current/streams/developer-guide/datatypes.html#avro
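As a rough sketch of what that configuration might look like: the serde takes a config map containing schema.registry.url, plus a flag saying whether it handles keys or values. The registry URL, class name and method names below are assumptions for illustration, not taken from the question. The second helper is included because the compiler output above reports the stream value type as mysql rather than GenericRecord, in which case a SpecificAvroSerde for the generated class may be the better fit.

```java
import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.specific.SpecificRecord;
import org.apache.kafka.common.serialization.Serde;

import java.util.Collections;
import java.util.Map;

public class AvroSerdeConfigSketch {

    // Assumption: Schema Registry is reachable at this address; adjust as needed.
    static final String SCHEMA_REGISTRY_URL = "http://localhost:8081";

    // Config map shared by both serdes.
    static Map<String, String> serdeConfig() {
        return Collections.singletonMap("schema.registry.url", SCHEMA_REGISTRY_URL);
    }

    // Serde for values handled as GenericRecord.
    static Serde<GenericRecord> genericValueSerde() {
        final Serde<GenericRecord> serde = new GenericAvroSerde();
        serde.configure(serdeConfig(), false); // false = used for record values, not keys
        return serde;
    }

    // Serde for values of an Avro-generated class (any SpecificRecord subtype).
    static <T extends SpecificRecord> Serde<T> specificValueSerde() {
        final Serde<T> serde = new SpecificAvroSerde<>();
        serde.configure(serdeConfig(), false); // false = used for record values, not keys
        return serde;
    }
}
```

If the streams really are typed as KStream<Integer, mysql>, a call such as AvroSerdeConfigSketch.<mysql>specificValueSerde() could supply both value serdes in Joined.with(...), so that the serde type matches the stream value type.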
