Spring Integration Kafka listener thread reads multiple partitions when concurrency = partition count
I set up a Spring Integration flow to process a topic with 3 partitions and set the listener container's concurrency to 3. As expected, I see three threads processing batches from all 3 partitions. However, in some cases one of the listener threads processes a single batch containing messages from multiple partitions.

My data is partitioned in Kafka by an id so that it can be processed concurrently with other ids, but never with the same id on another thread (which is what I was surprised to observe happening). From reading the docs, I thought each thread would be assigned one partition.

I'm using a KafkaMessageDrivenChannelAdapter like this:
private static final Class<List<MyEvent>> payloadClass =
        (Class<List<MyEvent>>) (Class) List.class;

public KafkaMessageDrivenChannelAdapterSpec.KafkaMessageDrivenChannelAdapterListenerContainerSpec<String, MyEvent> myChannelAdapterSpec() {
    return Kafka.messageDrivenChannelAdapter(tstatEventConsumerFactory(),
            KafkaMessageDrivenChannelAdapter.ListenerMode.batch, "my-topic") // 3 partitions
            .configureListenerContainer(c -> {
                c.ackMode(ContainerProperties.AckMode.BATCH);
                c.id(_ID);
                c.concurrency(3);
                RecoveringBatchErrorHandler errorHandler = new RecoveringBatchErrorHandler(
                        (record, exception) -> log.error("failed to handle record at offset {}: {}",
                                record.offset(), record.value(), exception),
                        new FixedBackOff(FixedBackOff.DEFAULT_INTERVAL, 2)
                );
                c.errorHandler(errorHandler);
            });
}

@Bean
public IntegrationFlow myIntegrationFlow() {
    return IntegrationFlows.from(myChannelAdapterSpec())
            .handle(payloadClass, (payload, headers) -> {
                service.performSink(payload);
                return null;
            })
            .get();
}
How do I set this up so that each listener container thread only processes messages from one partition? And is there additionally a way to keep a batch from ever containing messages from multiple partitions, even if a rebalance does occur?
That's not how a consumer group works. If you would like to have "sticky" consumers, then consider using manual assignment. See the channel adapter factory based on the TopicPartitionOffset... topicPartitions varargs:
/**
* Create an initial
* {@link KafkaMessageDrivenChannelAdapterSpec.KafkaMessageDrivenChannelAdapterListenerContainerSpec}.
* @param consumerFactory the {@link ConsumerFactory}.
* @param listenerMode the {@link KafkaMessageDrivenChannelAdapter.ListenerMode}.
* @param topicPartitions the {@link TopicPartitionOffset} vararg.
* @param <K> the Kafka message key type.
* @param <V> the Kafka message value type.
* @return the KafkaMessageDrivenChannelAdapterSpec.KafkaMessageDrivenChannelAdapterListenerContainerSpec.
*/
public static <K, V>
KafkaMessageDrivenChannelAdapterSpec.KafkaMessageDrivenChannelAdapterListenerContainerSpec<K, V> messageDrivenChannelAdapter(
ConsumerFactory<K, V> consumerFactory,
KafkaMessageDrivenChannelAdapter.ListenerMode listenerMode,
TopicPartitionOffset... topicPartitions) {
Then it is not going to be treated as a consumer group, and you have to create several channel adapters, each pointing to its specific partition. All of these channel adapters may emit messages to the same MessageChannel.
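For illustration, a sketch of what that might look like, assuming the factory and topic from the question (myConsumerFactory(), "my-topic", and the channel name "aggregateChannel" are placeholder names, not anything from the original post):

```java
// Sketch: one channel adapter per partition, manually assigned via
// TopicPartitionOffset, all feeding the same MessageChannel.
// Because partitions are assigned explicitly, no consumer group
// rebalancing occurs, so each thread only ever sees one partition.
@Bean
public IntegrationFlow partitionFlow0() {
    return IntegrationFlows.from(
            Kafka.messageDrivenChannelAdapter(myConsumerFactory(),
                    KafkaMessageDrivenChannelAdapter.ListenerMode.batch,
                    new TopicPartitionOffset("my-topic", 0)))
            .channel("aggregateChannel") // shared by all per-partition flows
            .get();
}

@Bean
public IntegrationFlow partitionFlow1() {
    return IntegrationFlows.from(
            Kafka.messageDrivenChannelAdapter(myConsumerFactory(),
                    KafkaMessageDrivenChannelAdapter.ListenerMode.batch,
                    new TopicPartitionOffset("my-topic", 1)))
            .channel("aggregateChannel")
            .get();
}

// ...and likewise for partition 2. The batch handler then subscribes to
// "aggregateChannel", receiving single-partition batches from all adapters.
```

The trade-off is that you lose automatic failover: if one adapter's consumer dies, no other consumer picks up its partition, so you'd need to monitor and restart the containers yourself.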