
How do JMS transactions work with concurrent consumers?

Hey, so I have a queue of messages which need sequential processing. Now this processing involves calling a web service (which might be down sometimes), so I have to make the message retrieval transactional. As far as I know, when there's an exception midway, the whole transaction rolls back and the message isn't lost, right?

But what's also needed is high availability on the message consumer, so I have two instances of the listener listening on my queue. Now, will the transaction ensure that a second message isn't retrieved by the other listener instance until the first one is completely done processing the first message? Or will I have to do something more to make sure that no message is sent out of the queue until the one before it is fully processed?

If any additional configuration is needed, would it be in the MQ or on the listener?

I'm using WebSphere MQ as the message broker and Spring Integration for retrieving the messages. Thanks for the help.

EDIT:

With the token approach, the first concern would be high availability of the queue manager itself. The queue which holds this token has to be part of some queue manager. Now if we have a failover, that control queue will no longer be accessible, which kinda means that we need another control queue ready in case of a failover.

We can't have listeners listening in on that DR control queue during normal operations though. (Let's say we have a mechanism to actually make sure that the "data" queue is perfectly replicated.) The listener instances should know that a failover has initiated so that they can stop listening to the normal-ops control queue and switch over to the secondary. I can't do this using the listener instance alone. The actual producer which puts messages into the queue would have to notify the listener instances to stop listening to the normal-ops control queue and switch over to the secondary. This would be kinda tricky if there's an intermediate connection problem (and the normal-ops queue manager isn't really down), but that's too much of a corner case.

With high availability of the control queue taken care of, we kinda have the same problem as the non-shareable approach during low-load scenarios. We have occasional spikes in load, but there are also slump periods (during the night and such). This token system is not really reactive, right? It's more of a periodic thing. So let's say we don't get any messages for a few hours. The listeners will still be constantly checking the queue because the token message keeps triggering one instance after another, which more or less makes it a poller really. I might as well have multiple listener instances, each polling at different times of the hour, right? It's not really event driven per se.

Third would really be the question of inserting the token message. During first install or during a failback, we'll have the extra manual step of inserting this token (since the token would sometimes be lost in a failover). We can't really make one of the listener instances do it, since if a listener instance doesn't find the message it kinda means that some other listener instance has the token. So this logic has to be separate. And if we actually put some meaningful info into this token message, it has to be a utility that is triggered rather than an insertion through the UI.

I guess the first and third aren't really problems, just extra overhead which wouldn't be needed if we went for a poller implementation. The second one is what's bothering me most.

You need to be passing tokens. Here's how that works:

First, create a second queue and place a single message into it. Now start up each program with the following logic (a rough JMS sketch follows the list):

  1. Get the token message off the token queue under syncpoint, using an unlimited or long wait interval and the FAIL_IF_QUIESCING option.
  2. Put the token message back on the token queue in the same UOW.
  3. Get the next message off the application queue under the same UOW.
  4. Process the application's message normally.
  5. Commit the UOW.
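
A minimal sketch of that loop using a transacted JMS session (the question mentions Spring Integration over JMS); queue names and the processing call are placeholders, and with the native MQ classes you would express the syncpoint with MQGMO_SYNCPOINT and MQGMO_FAIL_IF_QUIESCING instead:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class TokenSerializedListener {

    // Queue names are placeholders for this sketch.
    private static final String TOKEN_QUEUE = "APP.TOKEN.QUEUE";
    private static final String APP_QUEUE   = "APP.DATA.QUEUE";

    public static void run(ConnectionFactory cf) throws JMSException {
        Connection conn = cf.createConnection();
        // One transacted session: every receive/send below belongs to the same UOW.
        Session session = conn.createSession(true, Session.SESSION_TRANSACTED);
        MessageConsumer tokenIn  = session.createConsumer(session.createQueue(TOKEN_QUEUE));
        MessageProducer tokenOut = session.createProducer(session.createQueue(TOKEN_QUEUE));
        MessageConsumer appIn    = session.createConsumer(session.createQueue(APP_QUEUE));
        conn.start();

        while (true) {
            // 1. Block until this instance holds the token (unlimited wait).
            Message token = tokenIn.receive();
            // 2. Put the token back in the same UOW so it is only released on commit/rollback.
            tokenOut.send(token);
            // 3. Get the next application message in the same UOW. A bounded wait lets an
            //    idle instance roll back and hand the token to another listener now and then.
            Message appMsg = appIn.receive(30_000);
            if (appMsg == null) {
                session.rollback();          // no work arrived: pass the token on
                continue;
            }
            process(appMsg);                 // 4. e.g. call the web service
            session.commit();                // 5. commit the whole UOW
        }
    }

    private static void process(Message msg) {
        // Placeholder for the real processing, e.g. the web-service call.
    }
}
```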

You can use as many app instances as you want. You will see one input handle on each of the two queues for each app instance. No app instance will have to handle errors due to exclusive use of a queue.

Since there is only one token message and only one app can hold it under syncpoint at a time, only one of the apps can be actively processing an application message. Since the GET off the app queue is dependent on a successful GET off the token queue, all application messages are processed in strict sequence.

Note: the app will process the application messages with as many concurrent threads as there are token messages on the token queue. If anyone ever adds another token message to the token queue, strict sequence processing is lost. For this reason, read/write access to that queue must NOT be granted to anyone other than the app service account. Also, it is common for the token message to be structured so that the app can recognize it. If a stray unrelated message lands there, the app should ignore it and log a warning.

You will see a fairly even distribution of messages between the two apps. If you use more than two apps you might see a wildly uneven distribution, because queue handles are managed in a stack. As an instance commits its UOW, the next instance's handle is at the top of the stack, so it gets the next message. While it is handling that message, the instance that just committed will have its GET placed on top of the stack. If you have 3 or more listening instances, chances are only the first two will see any traffic.

This assures that messages are processed off the queue one at a time. It does not assure that you won't get dupes.

If you do everything under syncpoint, no messages will ever be lost. However, there's a scenario in which a message is retrieved and processed, then the COMMIT call fails. In that case the transaction is rolled back and the same message becomes available again. If you are using 1-phase commits and not XA, the processing for that message will not be rolled back.
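
For example, a minimal sketch of guarding against such duplicates by making the processing idempotent; the idempotency store and helper methods here are hypothetical (e.g. a database table keyed by message ID):

```java
import javax.jms.JMSException;
import javax.jms.Message;

class DedupingProcessor {

    void handle(Message msg) throws JMSException {
        // On redelivery after a failed commit, skip work we can prove was already done.
        if (msg.getJMSRedelivered() && alreadyProcessed(msg.getJMSMessageID())) {
            return;
        }
        callWebService(msg);
        recordProcessed(msg.getJMSMessageID());
    }

    // Hypothetical idempotency store lookups.
    boolean alreadyProcessed(String messageId) { return false; }
    void recordProcessed(String messageId)     { /* persist the ID */ }
    void callWebService(Message msg)           { /* the actual processing */ }
}
```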

The nice thing is that the token message will be under syncpoint too, and that fixes the issue in which an orphaned client channel takes a while to release the transaction. A new connection will get messages which are older than the message held under syncpoint by the orphan transaction. Eventually the channel agent times out, releasing the message back to the queue but effectively changing its position to behind any messages that were processed while it was sequestered.

In this scenario the token message is also sequestered, so after this type of connection loss, message processing temporarily stops and waits for the channel agent to time out. If that were ever to happen, just issue a STOP CHANNEL command on the instance with the UOW.


Update based on additional question details specific to this answer

The queue which holds this token has to be part of some queue manager. Now if we have a failover, that control queue will no longer be accessible, which kinda means that we need another control queue ready in case of a failover.

The token queue is as available or as unavailable as the application queue. Only one is needed. If the app requires HA then a multi-instance QMgr or a hardware HA cluster should be used. These share disk, so the QMgr that comes up in the failover is the same one the app has been connected to, just at a different physical location.

If the app needs DR, it's possible to replicate the disk under the QMgr's logs and data directories to a DR site. However, nothing should be listening on those instances while processing is going on in the primary data center.

The listener instances should know that a failover has initiated so that they can stop listening to the normal-ops control queue and switch over to the secondary. I can't do this using the listener instance alone.

Why not? WMQ has had reconnectable clients for a really long time, and the multi-instance features in v7.0.1 made reconnecting drop-dead simple. As an admin, your job is to make sure that no more than one instance of the app and token (not trigger!) queue is available. During an outage, the client goes into retry without requiring any application code to drive it. It finds whichever of the instances is up and connects.
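
For instance, with the IBM MQ classes for JMS the reconnect behaviour can be requested on the connection factory; the host names, channel, and queue manager name below are placeholders, and the exact setters may vary slightly by MQ version:

```java
import javax.jms.JMSException;

import com.ibm.mq.jms.MQConnectionFactory;
import com.ibm.msg.client.wmq.WMQConstants;

class ReconnectableFactory {

    static MQConnectionFactory create() throws JMSException {
        MQConnectionFactory cf = new MQConnectionFactory();
        cf.setTransportType(WMQConstants.WMQ_CM_CLIENT);
        cf.setQueueManagerName("QM1");                               // placeholder QMgr name
        cf.setConnectionNameList("mqprimary(1414),mqstandby(1414)"); // both multi-instance hosts
        cf.setChannel("APP.SVRCONN");                                // placeholder channel
        // Ask the client to reconnect on its own when the active instance fails over.
        cf.setClientReconnectOptions(WMQConstants.WMQ_CLIENT_RECONNECT);
        cf.setClientReconnectTimeout(1800);                          // keep retrying for 30 minutes
        return cf;
    }
}
```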

The actual producer which puts messages into the queue would have to notify the listener instances to stop listening to the normal-ops control queue and switch over to the secondary.

The question was about serialization with concurrent consumers. This is about a design in which producers and consumers have to rendezvous at a common location. That's a different problem which happens to overlap this one only in that it is somewhat complicated by serialization. Ask a different question if you need to explore topologies.

This token system is not really reactive, right? It's more of a periodic thing. So let's say we don't get any messages for a few hours. The listeners will still be constantly checking the queue because the token message keeps triggering one instance after another.

This does not use triggering. It uses a token (not trigger!) message the way a filesystem or database uses a lock, in order to facilitate serialization. Whichever listener gets the token message then does a get with unlimited wait on the application queue. The other listeners have a get with unlimited wait on the token (not trigger!) queue. Basically, they sit around idle until a message arrives. Zero reconnections, zero polls, zero CPU cycles. If you need to know they are alive, let them time out on the application queue once in a while. This rolls back their UOW on the token queue, which passes the token to another listener.

Third would really be the question of inserting the token message. During first install or during a failback, we'll have the extra manual step of inserting this token (since the token would sometimes be lost in a failover).

Why? Do you experience MQ losing persistent messages under syncpoint often? If so, you are doing it wrong. ;-) In a situation with strict serialization requirements there can be only one active instance of a queue. If for some reason there are other instances of the application queue pre-defined, other than through disk replication, there would be one instance of the token (not trigger!) queue also predefined alongside it, and one token (not trigger!) message waiting in each queue. Surely nobody would do such a thing in the face of strict message-order requirements, but if they did, those queues would surely be get-disabled while not in use.

We can't really make one of the listener instances do it, since if a listener instance doesn't find the message it kinda means that some other listener instance has the token.

Correct. The listeners could check queue depth, transactions, input handles, etc., but it is generally wise to avoid mixing application logic with control logic.

So this logic has to be separate. And if we actually put some meaningful info into this token message, it has to be a utility that is triggered rather than an insertion through the UI.

Why? Your coders handle structured data in app messages, right? If this is perceived to be significantly more difficult, someone's doing it wrong. ;-) Write an instance of a formatted token (not trigger!) message to a queue, then offload that to a file. When you need to reinitialize the queue, use Q or QLoad to first clear the queue, then load the file into it. That utility would be the one to open the queue for exclusive use, check for depth, check for handles, etc. prior to performing its magic. When I do this for consulting clients I typically define a service that initializes the queue on MQ startup, and also provide a feature in the application GUI for the operations and support staff. So long as the app managing the token (not trigger!) queue gets it for exclusive access during these operations, it really doesn't matter how it's done or by how many instances of the control app.

As a rule I also use the structure in the message to send commands to the listeners. There's the real token message, and then there are messages that command the app instances to do things. For example, it's really nice to have a non-transactional "ping" capability. If I drop more ping messages on the token (not trigger!) queue in a single UOW than I have app instances listening, I am guaranteed to contact all of them. In this way I can detect zombies. Depending on how much instrumentation is required, the listeners can react to the ping by providing a status (uptime, messages processed, etc.) in the log, to the console, to an event queue, etc.
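
How a listener tells the token apart from a ping or other command is an application convention; one hypothetical scheme (the property name and values are invented for this sketch) is a string property on the message:

```java
import javax.jms.JMSException;
import javax.jms.Message;

class ControlMessageRouter {

    void route(Message msg) throws JMSException {
        // "MSG_KIND" is an invented property; any agreed-on field in the body works as well.
        String kind = msg.getStringProperty("MSG_KIND");
        if ("TOKEN".equals(kind)) {
            // Holding the token: go read the application queue under the same UOW.
        } else if ("PING".equals(kind)) {
            // Report status (uptime, messages processed, ...) and acknowledge immediately.
        } else {
            // Stray unrelated message: ignore it and log a warning, as described above.
        }
    }
}
```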

I guess the first and third aren't really problems, just extra overhead which wouldn't be needed if we went for a poller implementation.

That's good because this is all pretty standard stuff. The problem lies mainly with the requirements for serialization conflicting with those for HA/DR. What you are looking for is global transactional atomicity to implement a single logical queue across multiple physical locations. IBM MQ has never attempted to provide that, although the WAS Messaging Engine has. The closest MQ comes is to use two MQ appliances with memory-to-memory replication of message and transaction data, but that is good for only a few miles before light-speed latency begins to significantly impact throughput. It doesn't handle your DR needs. In fact, nothing short of synchronous replication will do if you want a zero recovery point at the DR data center.

On your HA question...

If you've got two queue receivers reading from a single queue, I don't know of any natural way to avoid parallelism between them.

Maybe there's some vendor-specific feature, but it seems like an odd thing and I wouldn't have high expectations.

If you process the messages under syncpoint, it is true that you can roll back the transaction, putting the processed message back on the input queue in case an error occurs. If the program ends abnormally, an implicit backout occurs.

http://www-01.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.dev.doc/q026800_.htm?lang=en

Transactional processing of messages doesn't prevent two separate consumers from reading messages at the same time. There is a property on the queue called Shareability, which can be set to Not Shareable, preventing the queue from being opened by separate consumers at the same time. You should use this option and prepare your application to retry opening the queue, so that when the first instance fails, the second instance will open the queue.
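
A minimal sketch of that open-and-retry pattern using the MQ classes for Java; the queue name is a placeholder, and the same effect can come either from defining the queue as NOT SHARED or from requesting exclusive input on the open, as shown here:

```java
import com.ibm.mq.MQException;
import com.ibm.mq.MQQueue;
import com.ibm.mq.MQQueueManager;
import com.ibm.mq.constants.CMQC;

class ExclusiveOpener {

    static MQQueue openExclusive(MQQueueManager qMgr) throws MQException, InterruptedException {
        int openOptions = CMQC.MQOO_INPUT_EXCLUSIVE | CMQC.MQOO_FAIL_IF_QUIESCING;
        while (true) {
            try {
                return qMgr.accessQueue("APP.DATA.QUEUE", openOptions); // placeholder queue name
            } catch (MQException e) {
                if (e.reasonCode == CMQC.MQRC_OBJECT_IN_USE) {
                    Thread.sleep(5_000); // another instance holds the queue; retry until it fails over
                } else {
                    throw e;
                }
            }
        }
    }
}
```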
