
Java batch job in a cluster environment

We have a cluster with two JBoss nodes. We have a batch job that loads all user details from Active Directory into a database. The job runs every day. It previously ran in a non-clustered environment, so we designed it as a singleton. Now that we have a clustered environment, I do not know the best way to achieve the same result: I want the batch job to run only once a day. We use Spring and Hibernate, and I looked at Spring Batch, but I could not find a concise answer to my question.

Has anybody implemented batch processing in a cluster environment? What would be the best solution in this scenario?

We implemented this by triggering and starting the jobs externally via MQ (an HTTP request to start the job would work as well). The scheduler puts a message on the queue, and even though we have n nodes listening to the queue, only one node receives the message and, based on its contents, starts the job.

The real solution is to schedule the batch job externally rather than via an internal cron trigger. The actual start mechanism is secondary to that.
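The queue-based trigger described in this answer can be sketched as follows. This is only an illustration using the JDK: the in-memory `LinkedBlockingQueue` stands in for the MQ broker, the threads stand in for cluster nodes, and the class name, the `START:userSync` message format, and the counter are hypothetical. The point it shows is that a message on a shared queue is handed to a single consumer, so one trigger leads to one job start regardless of how many nodes are listening.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class QueueTriggeredJobDemo {

    // Simulates nodeCount nodes listening on one shared queue and returns
    // how many times the job was started for a single trigger message.
    static int runOnce(int nodeCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger jobStarts = new AtomicInteger();
        ExecutorService nodes = Executors.newFixedThreadPool(nodeCount);

        for (int i = 0; i < nodeCount; i++) {
            nodes.execute(() -> {
                try {
                    // Every node waits for a trigger; each message goes to one taker only.
                    String message = queue.poll(500, TimeUnit.MILLISECONDS);
                    if ("START:userSync".equals(message)) {
                        jobStarts.incrementAndGet(); // stand-in for launching the real job
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // The external scheduler publishes exactly one trigger message.
        queue.offer("START:userSync");

        nodes.shutdown();
        try {
            nodes.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return jobStarts.get();
    }

    public static void main(String[] args) {
        System.out.println("job starts: " + runOnce(3));
    }
}
```

With a real broker the same effect comes from using a point-to-point queue rather than a publish/subscribe topic, since a topic would deliver the trigger to every node.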

Consider also https://github.com/willschipp/spring-batch-cluster, which features:

  • write-behind for the Batch Job Repository
  • HA for batch in a cluster (automatic stop and failover of executing jobs)

In general, it is often a good idea to externalize or isolate batch jobs from transactional systems so that they do not interfere with availability or performance. That being said, if there are good reasons for embedding a batch job in a clustered application (simplicity, code reuse, etc.), then aside from the solutions already mentioned, ShedLock is a great option.
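To make the locking approach concrete, here is a minimal sketch of the underlying idea: every node fires its own scheduler, but only the node that wins a named lock in a shared store runs the job. This is not ShedLock's API. The class, method names, and lock name are hypothetical, and the in-memory map stands in for a shared store such as a database table that all nodes can reach.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

class LockedJobDemo {

    // Stand-in for a shared lock table: lock name -> instant until which it is held.
    private final ConcurrentHashMap<String, Instant> lockedUntil = new ConcurrentHashMap<>();

    // Atomically takes the lock if it is free or already expired at `now`.
    boolean tryAcquire(String name, Instant now, Duration holdFor) {
        AtomicBoolean acquired = new AtomicBoolean(false);
        lockedUntil.compute(name, (key, until) -> {
            if (until == null || !until.isAfter(now)) {
                acquired.set(true);
                return now.plus(holdFor); // hold the lock until this instant
            }
            return until; // another node still holds it
        });
        return acquired.get();
    }

    // Every node's scheduler fires at the same time; counts how many run the job.
    static int simulate(int nodeCount) {
        LockedJobDemo locks = new LockedJobDemo();
        Instant fireTime = Instant.parse("2024-01-01T02:00:00Z");
        int runs = 0;
        for (int node = 0; node < nodeCount; node++) {
            if (locks.tryAcquire("userSync", fireTime, Duration.ofHours(1))) {
                runs++; // stand-in for executing the batch job on this node
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println("runs across 2 nodes: " + simulate(2));
    }
}
```

The hold duration is a trade-off: it should be longer than the clock skew between nodes so a late node cannot start a second run, and shorter than the schedule interval so the next day's run is not blocked.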

For what it's worth, see also a blog post about batch jobs in clustered environments.
