
Design approaches for batch processing of parallel tasks

I have a batch job that is expected to process around 1,000 tasks at a time. Each task takes roughly 12 to 16 minutes on average.

In the current implementation, all tasks are pushed into a blocking queue. A single thread pops a task from this queue and processes it. Within each task we use Java's ExecutorService to run its sub-tasks concurrently; once all of the sub-tasks are processed, we mark the task as complete and go on to read the next task from the queue. We can't optimize the task processing time itself, because it calls a native library and we don't know what that library does internally.
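To make the described setup concrete, here is a minimal sketch of it, written for Java 1.7 (anonymous classes, no lambdas). The names `ImageTask`, `processChunk`, and `drain` are hypothetical and not taken from the real code base, `processChunk` only stands in for the native library call, and `poll()` replaces a blocking `take()` so that the example terminates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

public class SingleConsumerBatch {

    /** Hypothetical task: an image id plus the number of chunks (sub-tasks) to produce. */
    static final class ImageTask {
        final String imageId;
        final int chunkCount;

        ImageTask(String imageId, int chunkCount) {
            this.imageId = imageId;
            this.chunkCount = chunkCount;
        }
    }

    /** Stand-in for the native library call (assumption); it only builds a file name. */
    static String processChunk(ImageTask task, int chunkIndex) {
        return task.imageId + "-chunk-" + chunkIndex + ".jpg";
    }

    /** Handles one task at a time and returns the number of completed tasks. */
    static int drain(BlockingQueue<ImageTask> queue, ExecutorService pool) throws Exception {
        int completed = 0;
        ImageTask task;
        // poll() returns null on an empty queue, so the loop ends when the batch is done.
        while ((task = queue.poll()) != null) {
            final ImageTask current = task;
            List<Callable<String>> subTasks = new ArrayList<Callable<String>>();
            for (int i = 0; i < current.chunkCount; i++) {
                final int chunkIndex = i;
                subTasks.add(new Callable<String>() {
                    @Override
                    public String call() {
                        return processChunk(current, chunkIndex);
                    }
                });
            }
            // invokeAll blocks until every sub-task of this task has finished.
            for (Future<String> result : pool.invokeAll(subTasks)) {
                result.get(); // rethrows a sub-task failure, if any
            }
            completed++; // the task counts as complete only after all sub-tasks are done
        }
        return completed;
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<ImageTask> queue = new LinkedBlockingQueue<ImageTask>();
        for (int i = 0; i < 5; i++) {
            queue.add(new ImageTask("image-" + i, 8));
        }
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println("completed tasks: " + drain(queue, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

In this shape only one task is in flight at any moment, so total throughput is bounded by how fast a single machine can finish one task after another.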

With the current implementation, processing around 300 tasks takes more than 24 hours.

I'm looking for an appropriate platform or framework that could help reduce the processing time.

I'm using Java 1.7, OSGi, and Apache Karaf as the container.

PS: The task here is breaking down images ranging from 500 MB to 4 GB into small chunks and storing them in JPEG format.

For horizontal scaling I would use a messaging system. Simply put all the tasks into a JMS queue. Then start Karaf on a cluster of machines and let each instance listen on the queue. JMS will then feed the tasks to the listening processes, typically round robin, so each task goes to exactly one machine and the load is distributed across the cluster.
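The pattern behind this suggestion (several consumers competing for messages on one queue) can be illustrated in-process without a broker. The sketch below is only an analogue, not JMS code: a `LinkedBlockingQueue` stands in for the JMS queue, each thread stands in for one Karaf instance, and the class and method names (`CompetingConsumers`, `distribute`) are made up for illustration. How a real broker spreads messages over consumers depends on the broker and its configuration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicIntegerArray;

public class CompetingConsumers {

    /**
     * Starts workerCount workers that all pull from one shared queue and
     * returns how many tasks each worker handled.
     */
    static int[] distribute(int taskCount, int workerCount) throws InterruptedException {
        // Stand-in for the JMS queue (assumption): all tasks are enqueued up front.
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        for (int i = 0; i < taskCount; i++) {
            queue.add("image-" + i);
        }

        final AtomicIntegerArray handled = new AtomicIntegerArray(workerCount);
        final CountDownLatch finished = new CountDownLatch(workerCount);

        for (int w = 0; w < workerCount; w++) {
            final int workerId = w;
            // Each thread plays the role of one Karaf instance listening on the queue.
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        // poll() hands every queued task to exactly one worker.
                        while (queue.poll() != null) {
                            // The real image processing would happen here.
                            handled.incrementAndGet(workerId);
                        }
                    } finally {
                        finished.countDown();
                    }
                }
            }).start();
        }

        finished.await(); // wait until every worker has seen an empty queue

        int[] perWorker = new int[workerCount];
        for (int w = 0; w < workerCount; w++) {
            perWorker[w] = handled.get(w);
        }
        return perWorker;
    }

    public static void main(String[] args) throws InterruptedException {
        int total = 0;
        for (int count : distribute(20, 4)) {
            total += count;
        }
        System.out.println("total handled: " + total);
    }
}
```

The per-worker counts vary from run to run, but their sum always equals the number of queued tasks, because no task is delivered twice. In a real deployment, adding a machine corresponds to adding one more consumer on the shared queue.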
