
How to allocate tasks in a distributed system?

I have a large number of tasks and several worker servers. I want to distribute these tasks evenly across the workers, even if a worker server goes down.

My idea is to split the tasks into several shards and send each shard to a message queue (MQ), with each server reading from its own queue. I want the tasks to be processed as soon as possible. But if a server goes down, the tasks in its queue will not be consumed in a timely manner. How should I deal with that?

By the way, are there any Java frameworks that can help with this situation?

What you are describing is a cluster with shared message queues. As Thomas Timbul said, all the servers should read from the same message queue. If you are using IBM MQ, you should ideally install the queue manager on a separate system and have the servers connect to it, so that if one server goes down it does not affect the others.

Each server pulls a message off the queue and processes it on demand. With a J2EE server you can specify the number of threads reading the queue (the number of message-driven beans, or MDBs) on each server. For example, in WebSphere it is the maxSessions setting on the listener port.

If one server fails while processing a message, the transaction manager should roll back the transaction, and the message will go back on the queue to be read by another server.
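To make the rollback behaviour concrete, here is a minimal single-threaded sketch in plain Java. It does not use a real queue manager or transaction manager: the `RollbackDemo` class, the alternating workers "A" and "B", and the simulated crash are all assumptions made for illustration. In JMS, the equivalent would be a transacted session whose `commit()` or `rollback()` decides whether the message is removed or redelivered.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class RollbackDemo {
    // Delivers each message to workers A and B in turn. The first worker that
    // receives `failOn` "crashes"; the message is put back at the head of the
    // queue (the rollback) and the next worker picks it up.
    static List<String> process(List<String> messages, String failOn) {
        Deque<String> queue = new ArrayDeque<>(messages);
        List<String> log = new ArrayList<>();
        String[] workers = {"A", "B"};
        boolean crashedOnce = false;
        int attempt = 0;

        while (!queue.isEmpty()) {
            String worker = workers[attempt++ % workers.length];
            String msg = queue.poll();            // message is taken under a "transaction"
            try {
                if (msg.equals(failOn) && !crashedOnce) {
                    crashedOnce = true;
                    throw new IllegalStateException("worker " + worker + " crashed");
                }
                log.add(worker + " committed " + msg);   // commit: message is gone for good
            } catch (IllegalStateException e) {
                queue.addFirst(msg);                      // rollback: message returns to the queue
                log.add(worker + " rolled back " + msg);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : process(Arrays.asList("m1", "m2", "m3"), "m2")) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is that the message is only removed once a worker commits, so a crash in the middle of processing does not lose it.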

If servers process messages at different rates, it doesn't matter, because each server just pulls a message off the queue when it needs one.
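The pull model described above can be sketched with the JDK alone. This is only an in-process illustration, assuming a `LinkedBlockingQueue` as a stand-in for the shared message queue and threads as stand-ins for worker servers; the class name `SharedQueueDemo` and the per-task delays are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedQueueDemo {
    // Puts `taskCount` tasks on one shared queue and starts one worker thread
    // per entry in `delaysMs` (the simulated processing time per task).
    // Returns how many tasks each worker ended up processing.
    static Map<String, Integer> run(int taskCount, long[] delaysMs) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < taskCount; i++) {
            queue.add(i);
        }

        AtomicInteger[] counts = new AtomicInteger[delaysMs.length];
        Thread[] workers = new Thread[delaysMs.length];
        for (int w = 0; w < delaysMs.length; w++) {
            final AtomicInteger count = counts[w] = new AtomicInteger();
            final long delay = delaysMs[w];
            workers[w] = new Thread(() -> {
                try {
                    // Each worker pulls the next task only when it is free;
                    // poll() returns null once the queue is empty.
                    while (queue.poll() != null) {
                        Thread.sleep(delay);
                        count.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[w].start();
        }

        Map<String, Integer> result = new LinkedHashMap<>();
        for (int w = 0; w < workers.length; w++) {
            try {
                workers[w].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            result.put("worker-" + w, counts[w].get());
        }
        return result;
    }

    public static void main(String[] args) {
        // A fast worker (5 ms per task) and a slow one (20 ms per task).
        System.out.println(run(30, new long[] {5, 20}));
    }
}
```

The exact split varies from run to run, but the faster worker will normally take the larger share, and every task is processed exactly once. No up-front sharding is needed, and a worker that stops simply stops pulling.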

Be careful with messages that can't be processed, as they can cause the queue to become blocked. You need a retry count and a back-out queue to which bad messages are sent once they exceed the retry count. These are referred to as "poison" messages and are the subject of other questions on Stack Overflow and elsewhere.
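The retry count and back-out queue can be illustrated with a small in-memory sketch. It is not how any particular queue manager implements the feature: the `BackoutDemo` class, the `drain` method, and the `maxRetries` parameter are hypothetical names, and messages are assumed to be unique strings so that failures can be counted per message.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class BackoutDemo {
    // Holds the outcome: messages handled successfully and messages moved aside.
    static final class Result {
        final List<String> processed = new ArrayList<>();
        final List<String> backedOut = new ArrayList<>();
    }

    // `handler` returns true when a message was processed successfully.
    // A failed message is rolled back to the head of the queue, so without
    // the retry limit a poison message would block everything behind it.
    static Result drain(List<String> messages, Predicate<String> handler, int maxRetries) {
        Deque<String> queue = new ArrayDeque<>(messages);
        Map<String, Integer> failures = new HashMap<>();
        Result result = new Result();

        while (!queue.isEmpty()) {
            String msg = queue.poll();
            if (handler.test(msg)) {
                result.processed.add(msg);
                continue;
            }
            int failedAttempts = failures.merge(msg, 1, Integer::sum);
            if (failedAttempts > maxRetries) {
                result.backedOut.add(msg);   // retry count exceeded: move to the back-out queue
            } else {
                queue.addFirst(msg);         // rollback: message is redelivered next
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = drain(Arrays.asList("a", "bad", "c"), m -> !m.equals("bad"), 3);
        System.out.println("processed: " + r.processed);
        System.out.println("backed out: " + r.backedOut);
    }
}
```

With `maxRetries` set to 3, the message "bad" is attempted once, retried three times, and then moved to the back-out list, after which "c" can be processed.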

