
Get multiple messages from Kafka topic

My use case: on the producer side, each line of data (around 100 bytes) is posted as one message to a Kafka topic. On the consumer side I want to consume 5 messages at a time and hand them to my consumer logic.

@KafkaListener(id = "listener-batch", topics = "test", containerFactory = "concurrentKafkaListenerContainerFactory")
public void receive(@Payload List<String> messages,
                    @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
                    @Header(KafkaHeaders.OFFSET) List<Long> offsets) {

    System.out.println("- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -");
    System.out.println("Starting the process to receive batch messages :: " + messages);
    for (int i = 0; i < messages.size(); i++) {
        System.out.println("received message= "+ messages.get(i) +" with partition-offset= " + partitions.get(i) + "-" + offsets.get(i));
    }
    System.out.println("all the batch messages are consumed");
}

I built a sample, but the listener always receives just one message and prints it to the console. Please suggest any configuration changes required to achieve this.

Please find the source code below.

@EnableKafka
@Configuration
public class KafkaConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(ConsumerConfig.GROUP_ID_CONFIG, "batch");
        config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "5");
        return new DefaultKafkaConsumerFactory<>(config);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> concurrentKafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setBatchListener(true);
        return factory;
    }
}

I start the producer with the command below:

./kafka-producer-perf-test --num-records 500 --topic test --throughput 10 --payload-file test.csv --producer-props bootstrap.servers=localhost:9092 key.serializer=org.apache.kafka.common.serialization.StringSerializer value.serializer=org.apache.kafka.common.serialization.StringSerializer

Contents of test.csv:

Batch-1 message
Batch-2 message
Batch-3 message
Batch-4 message
Batch-5 message
Batch-6 message
Batch-7 message
Batch-8 message
Batch-9 message
Batch-10 message
Batch-11 message

The output looks like this:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Starting the process to receive batch messages :: [Batch-3 message]
received message= Batch-3 message with partition-offset= 0-839501
all the batch messages are consumed
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Starting the process to receive batch messages :: [Batch-7 message]
received message= Batch-7 message with partition-offset= 0-839502
all the batch messages are consumed
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Starting the process to receive batch messages :: [Batch-3 message]
received message= Batch-3 message with partition-offset= 0-839503
all the batch messages are consumed
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Starting the process to receive batch messages :: [Batch-1 message]
received message= Batch-1 message with partition-offset= 0-839504
all the batch messages are consumed
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Thanks in advance.

You should configure a batch listener, and then you can set the max.poll.records property to specify your batch size.

Note that setting this value too low might decrease overall performance, since you'll need more polls to the broker to fetch the same number of records.
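Also note that max.poll.records is only an upper bound: poll() returns as soon as any records are available, so with a slow producer (10 messages/second in your perf-test command) each poll typically finds just one record. To make the consumer wait until several messages have accumulated, you can additionally raise fetch.min.bytes and fetch.max.wait.ms in the consumer config. A sketch of your consumer factory with those two properties added; the 500-byte and 5000 ms values are illustrative guesses based on your ~100-byte messages, not tested tuning:

```java
@Bean
public ConsumerFactory<String, String> consumerFactory() {
    Map<String, Object> config = new HashMap<>();
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    config.put(ConsumerConfig.GROUP_ID_CONFIG, "batch");
    config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    // Upper bound only: poll() never returns MORE than 5 records,
    // but it happily returns fewer.
    config.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "5");
    // Ask the broker to hold the fetch until roughly 5 x 100-byte
    // messages have accumulated, or until the wait timeout expires.
    config.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, "500");
    config.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "5000");
    return new DefaultKafkaConsumerFactory<>(config);
}
```

With this, a fetch returns either when about 5 messages are buffered on the broker or when 5 seconds have passed, whichever comes first, so batches are still delivered (possibly smaller) when the producer is slow.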

The requirement provided here is very high level. It would be great if you could tell us your actual requirement from a business-logic perspective. Your low-level code and other configuration parameters can then be fine-tuned based on that requirement.

For the sake of giving a suggestion: if you just want to print out messages five at a time, you can poll 5 records at a time via max.poll.records = 5 and iterate over the consumer records. It's pretty simple.
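Independent of the Kafka configuration, the "hand 5 messages at a time to the business logic" step can also be done in plain Java by chunking whatever a poll returned. A minimal sketch; the class name, method name, and chunk size of 5 are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitter {

    // Split a list of polled messages into fixed-size chunks; the last
    // chunk may be smaller when the total is not a multiple of the size.
    public static <T> List<List<T>> chunk(List<T> records, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < records.size(); i += size) {
            chunks.add(new ArrayList<>(
                    records.subList(i, Math.min(i + size, records.size()))));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> messages = new ArrayList<>();
        for (int i = 1; i <= 11; i++) {
            messages.add("Batch-" + i + " message");
        }
        // 11 messages in chunks of 5 -> batches of 5, 5, and 1
        for (List<String> batch : chunk(messages, 5)) {
            System.out.println("handing " + batch.size() + " messages to consumer logic");
        }
    }
}
```

This way the listener can still accept however many records a poll delivers, while the downstream logic always sees groups of at most 5.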

Venkata Krishna, this can only be done by using a keying mechanism on the producer side. Each line of input produced by the source system must have a unique key associated with it, and a partitioning strategy should publish events with a given key to a specific partition. Based on that key, you can group the events using one of the available stateful operations or one of the windowing aggregations. For example, with a window you can group the events received per key over a given duration, publish them all in one batch to an intermediate topic, and have your consumer poll that number of records and iterate through them.
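A rough sketch of that windowed-aggregation idea using the Kafka Streams DSL; the topic names, the 10-second window, and the line-concatenating reducer are all illustrative assumptions, not tested code:

```java
// Group keyed records per 10-second window, concatenate each window's
// lines into one value, and publish the result to an intermediate topic
// that the downstream consumer polls in batches.
StreamsBuilder builder = new StreamsBuilder();
builder.stream("test", Consumed.with(Serdes.String(), Serdes.String()))
       .groupByKey()                                      // requires keyed records from the producer
       .windowedBy(TimeWindows.of(Duration.ofSeconds(10)))
       .reduce((agg, next) -> agg + "\n" + next)          // collect the window's lines into one value
       .toStream()
       .map((windowedKey, batch) -> KeyValue.pair(windowedKey.key(), batch))
       .to("test-batched");                               // intermediate topic for the consumer
```

The consumer then reads test-batched, where each record already represents one window's worth of events for a key.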
