I have a Flink application that reads data from a URL/port, performs processing on it, and returns JSON. I then convert the JSON to a String and sink it to Kafka.
If I only perform the processing, I can run about 30,000 strings per second through the function. However, when I add the function that converts the result to a String and then sink it to Kafka, my throughput drops to 17,000 strings per second.
Do I need to convert my JSON to a String before I sink it to Kafka? If not, how do I sink a JSON ObjectNode to Kafka?
Otherwise, what other solutions are there? I think the bottleneck is the toString function.
I tried converting the JSON to a String using several methods (the .toString function, StringBuilder to String).
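import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer

// Read from the source
val in_stream = env.socketTextStream(url, port, socket_stream_deliminator, socket_connection_retries).setParallelism(1)
  // Process each record (Process returns the JSON ObjectNode)
  .map(x => Process(x)).setParallelism(1)
  // Convert the ObjectNode to a String
  .map(x => {
    val json_string_builder = StringBuilder.newBuilder
    json_string_builder.append(x)
    // The last expression is the lambda's result; `return` is not valid here
    json_string_builder.toString()
  }).setParallelism(1)
  // Sink the data to Kafka
  .addSink(new FlinkKafkaProducer[String](broker_hosts, global_topic, new SimpleStringSchema()))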
I would like to maintain the 30,000 strings per second that I get without the convert-to-String function. Can I sink the ObjectNode directly to Kafka?
You can. The sink serializes each object to a byte array before sending it to Kafka. Make sure your sink function is supplied with a serializer that can convert an ObjectNode to a byte array.
Also make sure that the consumer is ready to receive ObjectNode objects, not Strings, meaning it uses a deserializer that matches your serializer.
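As a rough illustration of the serializer mentioned above, here is a minimal sketch. It assumes your ObjectNode is the plain Jackson class (com.fasterxml.jackson.databind.node.ObjectNode) rather than Flink's shaded copy, and the class name ObjectNodeSerializationSchema is made up for this example. Whether it actually restores your throughput is something you would need to measure, since the JSON still has to be written out somewhere.

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.node.ObjectNode
import org.apache.flink.api.common.serialization.SerializationSchema

// Hypothetical schema (name chosen for this example): writes each
// ObjectNode straight to JSON bytes, skipping the intermediate String.
class ObjectNodeSerializationSchema extends SerializationSchema[ObjectNode] {

  // Not serialized with the schema; created on first use on the worker.
  @transient private lazy val mapper = new ObjectMapper()

  override def serialize(element: ObjectNode): Array[Byte] =
    mapper.writeValueAsBytes(element)
}
```

With that in place you could drop the String-conversion map and change the sink to something like .addSink(new FlinkKafkaProducer[ObjectNode](broker_hosts, global_topic, new ObjectNodeSerializationSchema)). The bytes written to the topic are UTF-8 JSON, so the consumer has to parse them back into an ObjectNode (or read them as JSON text).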