
Fastest way to sink JSON to Kafka using Flink

Code Optimisation

I have a Flink application that reads data from a URL/port, performs processing on it, and returns JSON. I then convert the JSON to a String and sink it to Kafka.

Current Performance & Noted Issue

If I just perform the processing, I can run about 30,000 strings per second through the function. However, when I add the function to convert the JSON to a String and then sink to Kafka, my throughput drops to 17,000 strings per second.

Do I need to convert my JSON to a String before I sink it to Kafka? If not, how do I sink a JSON ObjectNode to Kafka?

Otherwise, what other solutions are there? I think the bottleneck is the to-String function.

I tried converting the JSON to a String using several methods (the .toString function, a StringBuilder converted to a String).

 // Read from source
 val in_stream = env.socketTextStream(url, port, socket_stream_deliminator, socket_connection_retries)
   .setParallelism(1)
   // Perform processing
   .map(x => Process(x)).setParallelism(1)
   // Convert the ObjectNode to a String (no `return` inside the lambda)
   .map(x => {
     val json_string_builder = StringBuilder.newBuilder
     json_string_builder.append(x)
     json_string_builder.toString()
   }).setParallelism(1)
   // Sink data to Kafka
   .addSink(new FlinkKafkaProducer[String](broker_hosts, global_topic, new SimpleStringSchema()))

I would like to maintain the 30,000 strings per second of processing, which I do get without the convert-to-String function. Can I sink the ObjectNode directly to Kafka?

You can. The sink serializes the given objects to a byte array before sending them to Kafka. Make sure your sink is supplied with a serializer that is capable of converting an ObjectNode to a byte array.
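
For example, a minimal sketch of such a serializer, assuming Jackson's ObjectMapper and a Flink version where SerializationSchema lives in org.apache.flink.api.common.serialization (the class name ObjectNodeSerializationSchema is made up for illustration):

 import com.fasterxml.jackson.databind.ObjectMapper
 import com.fasterxml.jackson.databind.node.ObjectNode
 import org.apache.flink.api.common.serialization.SerializationSchema

 // Hypothetical schema: write the ObjectNode straight to bytes,
 // skipping the intermediate String entirely
 class ObjectNodeSerializationSchema extends SerializationSchema[ObjectNode] {
   // ObjectMapper is not Serializable, so build it lazily on the task manager
   @transient private lazy val mapper = new ObjectMapper()

   override def serialize(element: ObjectNode): Array[Byte] =
     mapper.writeValueAsBytes(element)
 }

With that in place, the to-String map stage can be dropped and the producer typed to ObjectNode:

 .addSink(new FlinkKafkaProducer[ObjectNode](broker_hosts, global_topic, new ObjectNodeSerializationSchema()))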

Also make sure that the consumer is prepared to receive ObjectNode objects, not Strings.
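
If the consumer is another Flink job, one option (a sketch, not from the original post) is the JSONKeyValueDeserializationSchema that ships with Flink's Kafka connector, which yields ObjectNode records:

 import java.util.Properties
 import com.fasterxml.jackson.databind.node.ObjectNode
 import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer
 import org.apache.flink.streaming.util.serialization.JSONKeyValueDeserializationSchema

 // Sketch: read the records back as ObjectNodes; each node exposes the
 // record under "value" (and "key", plus metadata if requested)
 val props = new Properties()
 props.setProperty("bootstrap.servers", broker_hosts)
 val json_stream = env.addSource(
   new FlinkKafkaConsumer[ObjectNode](
     global_topic,
     new JSONKeyValueDeserializationSchema(false), // false: skip Kafka metadata
     props))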
