Spring Batch parallel Tasklet(s)

I am currently building a Spring Batch application in which several steps are executed. All but one of the steps are simple tasklets (with no reader or writer), responsible for tasks such as copying files, sending requests, and starting batch (*.bat) files.

Most of the steps should be executed serially. In one specific step, I want to start X *.bat files, of which at most Y may run at the same time.

For example, say I have 10 *.bat files but want at most 5 running in parallel. The first 5 start together; when one of those finishes, the next (6th) should start, and so on until all 10 are processed.

Of course, once all 10 have finished, the next step should start (so the step as a whole is a synchronous execution).

The questions:

  1. Is Spring Batch the correct way to go? (Is it a step that should be executed 10 times with different parameters?)
  2. Or should I execute the step only once and develop a "thread controller" that allows at most 5 (or Y) threads?
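
For reference, option 2 can be sketched in plain Java without any custom thread handling. This is only a minimal sketch under assumptions: the class name BoundedBatchRunner and the method runAll are hypothetical, and a short sleep stands in for actually launching a *.bat file. A fixed thread pool of size Y caps the concurrency, and invokeAll returns only after every task has completed, which gives the synchronous behaviour described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Spring Batch.
public class BoundedBatchRunner {

    /** Runs every task, at most maxParallel at a time, and returns once all are done. */
    public static <T> List<T> runAll(List<Callable<T>> tasks, int maxParallel) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every submitted task has completed.
            for (Future<T> future : pool.invokeAll(tasks)) {
                // get() rethrows a task failure wrapped in an ExecutionException.
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            final int id = i;
            tasks.add(() -> {
                // Track how many tasks run at once, to observe the limit.
                peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                // Stand-in for starting a *.bat file and waiting for it to exit.
                Thread.sleep(50);
                running.decrementAndGet();
                return id;
            });
        }
        List<Integer> done = runAll(tasks, 5);
        System.out.println(done.size() + " tasks finished, peak concurrency " + peak.get());
    }
}
```

Inside a single tasklet, a call like runAll(tasks, 5) would keep the step blocked until all files are processed, at the cost of Spring Batch seeing only one step instead of one per file.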

If 1 == true :) I guess I have to work with the taskExecutor. Below is an example: I start with a first step (which, let's say, has to find out what X is). After that I have a flowParallel (which simply says: if there are more batches, start the step again). Then I made a split to allow parallel execution (currently only 3 flows; of course I could add all X with a loop and limit them via "taskExecutor.setMaxPoolSize", which I find stupid).

    Flow flowInit = new FlowBuilder<Flow>("flowInit")
            .from(stepS1)
            .end();

    Flow flowParallel = new FlowBuilder<Flow>("flowParallel")
            .start(stepS1Parallel)
            .next(deciderOne)
            .on("thereAreMoreBatchesToExecute")
            .to(stepS1Parallel).end();

    final Flow splitFlow = new FlowBuilder<Flow>("splitFlow")
            .start(flowParallel)
            .split(new SimpleAsyncTaskExecutor())
            .add(flowParallel, flowParallel, flowParallel)
            .build();

    return jobs.get("dataLoadParallel")
            .start(flowInit)
            .next(splitFlow)
            .next(stepS1)
            .end().build();

So, what am I doing wrong? Which way should I go?

If you want to set a maximum concurrency, then you have to use the setConcurrencyLimit method of the SimpleAsyncTaskExecutor.

If you want to have several steps running in parallel, you need to instantiate a unique step and a unique flow for each of them. In your example above, you start the same instance of a step (stepS1Parallel) inside the same instance of a flow (flowParallel) in parallel. This means that the same step instance is called from multiple threads, and this will definitely screw things up.

So, you need a loop in which you create one instance of a step together with one instance of a flow for every *.bat file you want to process.
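
A rough sketch of such a loop, combined with the concurrency limit mentioned above, might look like the following. It assumes the Spring Batch 4.x builder API (StepBuilderFactory, matching the jobs.get(...) style in the question), a non-empty list of files, and a Windows host where "cmd /c" launches a *.bat file. The class BatFileSplitFlowFactory, the method buildSplitFlow, and the step and flow names are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.job.builder.FlowBuilder;
import org.springframework.batch.core.job.flow.Flow;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.core.task.SimpleAsyncTaskExecutor;

// Hypothetical helper class; requires Spring Batch on the classpath.
public class BatFileSplitFlowFactory {

    private final StepBuilderFactory steps;

    public BatFileSplitFlowFactory(StepBuilderFactory steps) {
        this.steps = steps;
    }

    /** Builds a split flow with one step and one flow per *.bat file (assumes batFiles is non-empty). */
    public Flow buildSplitFlow(List<String> batFiles, int maxParallel) {
        // At most maxParallel flows run at once; further submissions wait for a free slot.
        SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor("bat-");
        executor.setConcurrencyLimit(maxParallel);

        List<Flow> flows = new ArrayList<>();
        for (int i = 0; i < batFiles.size(); i++) {
            String batFile = batFiles.get(i);

            // A separate step instance per file, so no step is shared between threads.
            Step step = steps.get("runBat-" + i)
                    .tasklet((contribution, chunkContext) -> {
                        // Assumption: Windows host; start the file and wait for it to exit.
                        int exitCode = new ProcessBuilder("cmd", "/c", batFile)
                                .inheritIO()
                                .start()
                                .waitFor();
                        if (exitCode != 0) {
                            throw new IllegalStateException(batFile + " exited with code " + exitCode);
                        }
                        return RepeatStatus.FINISHED;
                    })
                    .build();

            // A separate flow instance wrapping each step.
            flows.add(new FlowBuilder<Flow>("flowBat-" + i).start(step).build());
        }

        return new FlowBuilder<Flow>("splitFlow")
                .split(executor)
                .add(flows.toArray(new Flow[0]))
                .build();
    }
}
```

The resulting flow could then replace splitFlow in the job definition from the question, for example jobs.get("dataLoadParallel").start(flowInit).next(factory.buildSplitFlow(batFiles, 5)).next(stepS1).end().build(), where factory and batFiles are again placeholder names. The split completes only after all of its flows have finished, so the following step starts after the last *.bat file is done.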

HTH
