
Kafka consumer - Pause polling of events from a specific Kafka topic partition to use it as a delayed queue

We have a scenario in our system where user-details events are published to Kafka topic XYZ by some other producing application A (a different system), and my application B consumes from that topic.

The requirement is that application B needs to consume each event 45 minutes (or any configurable time) after A puts it on topic XYZ. The reason for this delay is that a REST API of some system C needs to be triggered, based on this user-details event, to confirm whether a certain flag is set for that user, and the flag can be set at any point within that 45-minute window. (This could have been avoided if C had the capability to publish to Kafka or notify us in some other way, but it does not.)

Our application B is written in Spring.

The solution I tried was to take an event from Kafka and check its timestamp: if the event is already 45 minutes old, process it; if it is younger than that, pause the Kafka container for the remaining time using the MessageListenerContainer pause() method. Something like below -

@KafkaListener(id = "delayed_listener", topics = "test_topic", groupId = "test_group")
public void delayedConsumer(@Payload String message,
                            Acknowledgment acknowledgment) {

    UserDataEvent userDataEvent;
    try {
        userDataEvent = this.mapper.readValue(message, UserDataEvent.class);
    } catch (JsonProcessingException e) {
        logger.error("error while parsing message", e);
        // acknowledge and skip the unparseable message instead of risking an NPE below
        acknowledgment.acknowledge();
        return;
    }
    MessageListenerContainer delayedContainer =
            this.kafkaListenerEndpointRegistry.getListenerContainer("delayed_listener");
    long messageAge = System.currentTimeMillis() - userDataEvent.getPublishTime();
    if (messageAge < DELAY_MS) { // DELAY_MS: the configured 45-minute delay
        long sleepTimeForPolling = DELAY_MS - messageAge;
        // give negative ack to put the already-polled message back on the topic
        acknowledgment.nack(1000);
        // pause the container, and resume it once the remaining delay has elapsed
        delayedContainer.pause();
        ScheduledExecutorService scheduledExecutorService = Executors.newScheduledThreadPool(1);
        scheduledExecutorService.schedule(delayedContainer::resume,
                sleepTimeForPolling, TimeUnit.MILLISECONDS);
        return;
    }
    // if the message is already 45 minutes old, process it
    this.service.processMessage(userDataEvent);
    acknowledgment.acknowledge();
}

Though it works for a single partition, I am not sure this is the right approach; any comments on that? Also, I can see that multiple partitions will cause problems: the pause() call above pauses the whole container, so if one partition holds an old message, it will not be consumed while the container is paused because of a newer message in some other partition. Can I somehow apply this pause logic at the partition level?

Is there any better or recommended solution for achieving this delayed processing after a configurable amount of time that I could adopt in this scenario, rather than doing what I did above?

Kafka is not really designed for such scenarios.

One way I could see that technique working would be to set the container concurrency to the number of partitions in the topic, so that each partition is processed by a different consumer on a different thread; then pause/resume the individual Consumer&lt;?, ?&gt;s instead of the whole container.
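A minimal sketch of that container setup, assuming Spring Kafka and a three-partition topic (the concurrency value, bean names, and the idle-event interval are illustrative, not part of the original answer):

```java
// Sketch only: spring-kafka assumed on the classpath; values are illustrative.
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // one consumer thread per partition, so pausing one consumer
    // pauses only that partition's processing
    factory.setConcurrency(3); // match the topic's partition count
    // publish ListenerContainerIdleEvents so an event listener can resume paused consumers
    factory.getContainerProperties().setIdleEventInterval(5000L);
    return factory;
}
```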

To do that, add the Consumer&lt;?, ?&gt; as an additional parameter of the listener method. To resume the consumer, set the idleEventInterval and check the timer in an event listener for ListenerContainerIdleEvent. The Consumer&lt;?, ?&gt; is a property of that event, so you can call resume() there.
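A hedged sketch of that per-consumer pause/resume, assuming the container factory publishes idle events; the delayMs field, the readyToResume() helper, and the record types are illustrative assumptions (the nack signature also varies across Spring Kafka versions):

```java
// Sketch only: spring-kafka assumed; delayMs and readyToResume() are illustrative.
@KafkaListener(id = "delayed_listener", topics = "test_topic", groupId = "test_group")
public void delayedConsumer(ConsumerRecord<String, String> record,
                            Acknowledgment acknowledgment,
                            Consumer<?, ?> consumer) {
    long messageAge = System.currentTimeMillis() - record.timestamp();
    if (messageAge < delayMs) {
        // requeue the record (signature is nack(long) or nack(Duration) depending on version)
        acknowledgment.nack(1000);
        // pauses only this thread's assigned partitions, not the whole container
        consumer.pause(consumer.assignment());
        return;
    }
    this.service.processMessage(record.value());
    acknowledgment.acknowledge();
}

// Fired every idleEventInterval while the consumer is idle/paused;
// the event carries the Consumer, so it can be resumed here.
@EventListener
public void onIdleEvent(ListenerContainerIdleEvent event) {
    if (readyToResume()) { // e.g. the earliest paused record is now older than delayMs
        event.getConsumer().resume(event.getConsumer().paused());
    }
}
```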
