
ActiveMQ - wait for all messages to be consumed

I have a case where a bulk action processes multiple items. After all items are processed, I have to update the action status to completed. Items are processed in parallel by multiple consumers.

In theory, after processing an item a consumer could check whether there are no items left (or no messages left in the queue for this action). But it's possible that two consumers (A and B) finish at the same time and both run the check at the same time. Each sees that the other is not yet done, because the other's transaction is not yet committed: consumer A will not see the changes made by consumer B, and consumer B will not see the changes made by consumer A, so neither of them would update the action status. Am I right?

How can I implement such a condition without some kind of additional periodic status check and its overhead? A periodic check might be fine if there are thousands of items per action, but if there are usually 1-2 long-running ones it's very inefficient.

Thanks!

Edit: in short, what is the correct approach to trigger some action after processing a set of messages, given that:

  • messages must be processed in parallel
  • periodically checking whether all messages were processed is not the answer

You can have each consumer enqueue an are-we-done-yet message onto a new queue when it finishes processing (and before it commits its own transaction). The are-we-done-yet messages should go into a queue with a single consumer; when this consumer processes a message, it checks whether the original queue is empty. This serializes the check and resolves the issue that was originally caused by the parallelism.

This isn't the most efficient approach in the general case, but since you mentioned that you have just 1-2 long-running items, it may work for you. I've done this before in a similar situation and it worked quite well.
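The serialized check described above can be sketched in plain Java. This is only an in-memory stand-in, not ActiveMQ or JMS code: the single-threaded executor plays the role of the are-we-done-yet queue with its single consumer, and the `pendingItems` counter stands in for the depth of the original queue. The class and method names (`SerializedCompletionCheck`, `itemFinished`, `runDemo`) are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for an "are-we-done-yet" queue with a single consumer. */
final class SerializedCompletionCheck {
    // Assumption: stands in for the number of messages left in the original queue.
    private final AtomicInteger pendingItems;
    // Assumption: a single-threaded executor models the queue with exactly one consumer.
    private final ExecutorService checker = Executors.newSingleThreadExecutor();
    private final AtomicInteger timesMarkedCompleted = new AtomicInteger();
    // Only read and written on the checker thread, so no extra locking is needed.
    private boolean completed = false;

    SerializedCompletionCheck(int itemCount) {
        this.pendingItems = new AtomicInteger(itemCount);
    }

    /** Called by a worker once its item is done; enqueues one "are we done yet?" check. */
    void itemFinished() {
        pendingItems.decrementAndGet();
        checker.execute(this::checkIfDone);
    }

    /** Runs on the single checker thread only, so checks never overlap. */
    private void checkIfDone() {
        if (!completed && pendingItems.get() == 0) {
            completed = true;
            timesMarkedCompleted.incrementAndGet(); // "update the action status" goes here
        }
    }

    /** Processes the given number of no-op items in parallel; returns how often the action was completed. */
    static int runDemo(int items, int workers) {
        SerializedCompletionCheck check = new SerializedCompletionCheck(items);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < items; i++) {
            pool.execute(check::itemFinished);
        }
        try {
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
            check.checker.shutdown(); // already queued checks still run
            check.checker.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return check.timesMarkedCompleted.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(100, 8));
    }
}
```

Several checks may observe an empty queue when workers finish close together, which is why the sketch keeps a `completed` flag so the status update happens only once.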

You want a first-class batching facility. Relying simply on the size of the queue is not reliable enough. For example, a process can work on a message and then reject it, which places the message back on the queue (which was previously "empty").

Rather, make the batch a first-class concept. Consider sending a "batch start" message that contains the number of items in the batch. Then, as messages are processed, the consumers can update a batch status record or some other device. The batch status can track the number of messages processed, the number that passed, the number that failed, etc.

When a message is processed, the consumer can check the batch status to see whether it is handling the "last message": that is the case when the processed count equals the batch count "minus 1" (since the message it is currently running has not been counted yet).

You'll want to make this process atomic. For example, if you're using SQL, you'll want to fetch the batch status row "FOR UPDATE", which locks the row for your transaction so that your comparison is atomic.

You could also put a trigger on the row and have it do the check, if that's more your style.

Or you could have a global object in your system that manages this for you. There are all sorts of mechanisms.

But the key is that you have some overarching batch concept to manage all of the workers. You can't do this reliably at the individual worker level.
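As a rough illustration of the "global object" variant, here is a minimal single-JVM sketch in which an atomic counter plays the role of the locked batch status row. The names (`BatchStatus`, `recordProcessed`, `failedCount`) are hypothetical, and the sketch assumes each message is recorded exactly once, so redelivered messages would need deduplication. Because the counter is incremented before the comparison, it compares against the full batch size rather than the batch size "minus 1".

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory batch status shared by all workers of one batch. */
final class BatchStatus {
    private final int batchSize; // taken from the "batch start" message
    private final AtomicInteger processed = new AtomicInteger();
    private final AtomicInteger failed = new AtomicInteger();

    BatchStatus(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /**
     * Records one processed message. The increment is atomic, so exactly one
     * caller sees the counter reach batchSize and gets true back.
     */
    boolean recordProcessed(boolean success) {
        if (!success) {
            failed.incrementAndGet();
        }
        return processed.incrementAndGet() == batchSize;
    }

    int failedCount() {
        return failed.get();
    }

    public static void main(String[] args) {
        BatchStatus status = new BatchStatus(2);
        System.out.println(status.recordProcessed(true)); // false: one message still outstanding
        System.out.println(status.recordProcessed(true)); // true: this call completed the batch
    }
}
```

The worker that receives `true` is the one that should update the action status. With several JVMs, the same comparison would have to run against shared storage, for instance the locked database row mentioned above.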

As a combination of Will's and Dan's answers, I'd suggest a batch administration queue. "Batch Start" messages carrying a batch size counter arrive there, together with "Message Processed" messages that the consumers send when they're done processing a message.

Its single administration consumer can count the processed messages as they arrive until the count matches the batch size, and then log that the batch is done.
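The bookkeeping of such an administration consumer might look like the sketch below. It is not tied to any messaging API: the two handler methods are assumed to be called only from the single consumer thread of the administration queue, which is why plain maps suffice. The names (`BatchAdministrator`, `onBatchStart`, `onMessageProcessed`) and the use of a string batch id are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical state kept by the single consumer of the administration queue. */
final class BatchAdministrator {
    private final Map<String, Integer> expected = new HashMap<>();  // batch id -> batch size
    private final Map<String, Integer> processed = new HashMap<>(); // batch id -> messages seen
    private final List<String> completedBatches = new ArrayList<>();

    /** Handles a "Batch Start" message. */
    void onBatchStart(String batchId, int batchSize) {
        expected.put(batchId, batchSize);
        checkDone(batchId);
    }

    /** Handles a "Message Processed" message; it may arrive before the "Batch Start". */
    void onMessageProcessed(String batchId) {
        processed.merge(batchId, 1, Integer::sum);
        checkDone(batchId);
    }

    private void checkDone(String batchId) {
        Integer size = expected.get(batchId);
        int done = processed.getOrDefault(batchId, 0);
        if (size != null && done == size) {
            expected.remove(batchId);
            processed.remove(batchId);
            completedBatches.add(batchId); // log or update the action status here
        }
    }

    List<String> completedBatches() {
        return new ArrayList<>(completedBatches);
    }

    public static void main(String[] args) {
        BatchAdministrator admin = new BatchAdministrator();
        admin.onBatchStart("batch-1", 2);
        admin.onMessageProcessed("batch-1");
        admin.onMessageProcessed("batch-1");
        System.out.println(admin.completedBatches()); // [batch-1]
    }
}
```

Counting "Message Processed" messages even before the matching "Batch Start" has been seen keeps the sketch correct if a fast worker's message overtakes the start message.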

To allow for error situations, you have to do a periodic check.

For example, suppose you have two consumers and one message in a queue. Consumer 1 picks up the message and starts to process it. Now consumer 1 crashes unexpectedly and the transaction is rolled back. The message now needs to be picked up and processed by consumer 2.

Therefore consumer 2 can't exit until all the messages have been successfully processed. The only way of checking this is to check the queue size periodically until it is empty. If consumer 2 simply exits when there are no more messages for it, you will end up with unprocessed messages in the queue if consumer 1 has to roll back a transaction.

Create an ActionMonitor responsible for marking the Action as finished. The different ActionConsumer instances notify it when they are done. When the number of consumers that have finished equals the number of consumers that were running, the ActionMonitor marks the Action as finished.

With this solution, there's no need to add any extra queue or thread. Marking the Action as finished is performed by the same thread that consumed the last element.

It would look like this:

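public class ActionMonitor {
    private final int numberOfConsumers; // Total number of consumers, known in advance.
    private int numberOfConsumersFinished;

    public ActionMonitor(int numberOfConsumers) {
        this.numberOfConsumers = numberOfConsumers;
    }

    // Called by each consumer once it has no more elements to process.
    public synchronized void consumerFinished() { // Sync could be more efficient.
        numberOfConsumersFinished++;
        if (numberOfConsumers == numberOfConsumersFinished) {
            markTheActionAsFinished();
        }
    }

    private void markTheActionAsFinished() {
        // Placeholder: update the action status to completed here.
    }
}

public abstract class ActionConsumer {

    private final ActionMonitor actionMonitor;

    protected ActionConsumer(ActionMonitor actionMonitor) {
        this.actionMonitor = actionMonitor;
    }

    // Placeholders for the queue-specific logic.
    protected abstract boolean moreElementsToProcess();
    protected abstract void takeNewElementAndProcessIt();

    public void processElementsInAction() {
        while (moreElementsToProcess()) {
            takeNewElementAndProcessIt();
        }
        actionMonitor.consumerFinished(); // Runs on the thread that consumed this consumer's last element.
    }
}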

Warning: You need to know in advance how many consumers there will be.

I hope it helps.
