
Spring-batch flow / split after a step

I am building a spring-batch solution that contains the following process:

step 1: split a list into multiple lists
step 2: process each sub-list
step 3: merge sub-lists

The generated sub-lists can be processed in parallel, and according to the spring-batch documentation this is supported. Sadly I can only find spring-batch example jobs that start with parallel steps, not examples that start out sequentially.

The following job will not compile. Spring gives me an error: 'cannot resolve step2'

<batch:job id="webServiceJob2">
    <batch:step id="step1" next="step2"></batch:step>
    <batch:split id="step2" next="step3"></batch:split>
    <batch:step id="step3"></batch:step>
</batch:job>

So how can I configure a job to first run a single step, then run a number of steps in parallel, and then run a last single step?

I've stumbled upon this question asking about how split works, and maybe this answer arrives a bit (one year) late, but here I go...

The issue is that a "split" is not a step in itself, but you were naming (and referencing) it as if it were one:

<batch:job id="webServiceJob2">
    <batch:step id="step1" next="step2"></batch:step>
    <batch:split id="step2" next="step3"></batch:split> <!-- This is not a step -->
    <batch:step id="step3"></batch:step>
</batch:job>

The correct syntax would be:

<batch:job id="webServiceJob2">
    <batch:step id="step1" next="split_step2"></batch:step>
    <batch:split id="split_step2" next="step3">
        <batch:flow>
            <batch:step id="step2_A_1" ... next="step2_A_2"/>
            <batch:step id="step2_A_2" ... />
        </batch:flow>
        <batch:flow>
            <batch:step id="step2_B_1" ... />
        </batch:flow>
    </batch:split>
    <batch:step id="step3"></batch:step>
</batch:job>

But this is probably not what you want to achieve: with split declarations you have to fix the exact number of parallel flows at configuration time, and the purpose of split is to run different steps in each flow rather than to run the same step several times.

You should check the documentation on scaling and parallel processing; the partition step seems a good candidate for your requirements.
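As a rough sketch, a partitioned version of the job from the question might look like the configuration below. It assumes a hypothetical `Partitioner` implementation (`com.example.SubListPartitioner`) that returns one execution context per sub-list, and hypothetical tasklet beans named `splitTasklet`, `processSubListTasklet` and `mergeTasklet`, none of which appear in the question. The `grid-size` value is arbitrary.

```xml
<batch:job id="webServiceJob2">
    <!-- step 1: split the list (tasklet bean name is a placeholder) -->
    <batch:step id="step1" next="step2Manager">
        <batch:tasklet ref="splitTasklet"/>
    </batch:step>

    <!-- step 2: the worker step is executed once per partition, in parallel -->
    <batch:step id="step2Manager" next="step3">
        <batch:partition step="processSubList" partitioner="subListPartitioner">
            <batch:handler grid-size="4" task-executor="taskExecutor"/>
        </batch:partition>
    </batch:step>

    <!-- step 3: merge the results (tasklet bean name is a placeholder) -->
    <batch:step id="step3">
        <batch:tasklet ref="mergeTasklet"/>
    </batch:step>
</batch:job>

<!-- worker step, declared outside the job and referenced by the partition element -->
<batch:step id="processSubList">
    <batch:tasklet ref="processSubListTasklet"/>
</batch:step>

<!-- hypothetical Partitioner: creates one ExecutionContext per sub-list -->
<bean id="subListPartitioner" class="com.example.SubListPartitioner"/>
<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>
```

Unlike a split, the number of parallel executions here comes from the map the partitioner returns at run time, so it does not have to be written into the job definition.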

Of course you can have a split in the middle of a job! Here is the example from Spring Batch In Action (2012).

<batch:job id="importProductsJob">
  <batch:step id="decompress" next="readWrite">
    <batch:tasklet ref="decompressTasklet"/>
  </batch:step>
  <batch:split id="readWrite" next="moveProcessedFiles">
    <batch:flow>
      <batch:step id="readWriteBookProduct"/>
    </batch:flow>
    <batch:flow>
      <batch:step id="readWriteMobileProduct"/>
    </batch:flow>
  </batch:split>
  <batch:step id="moveProcessedFiles">
    <batch:tasklet ref="moveProcessedFilesTasklet" />
  </batch:step>
</batch:job>

Parallel steps would indicate a different step for each sub-list, which I don't think is what you want.
A single Multi-threaded Step seems more appropriate.
As documented, you start by defining a TaskExecutor bean, which will process each chunk in a separate thread. Since TaskExecutors are fairly simple to use, you could also invoke the TaskExecutor on your own. In this case, your step can be multi-threaded without Spring Batch needing to know about it.
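A minimal sketch of such a multi-threaded step, assuming chunk-oriented processing; the bean names `itemReader`, `itemProcessor` and `itemWriter` and the commit interval are placeholders, not taken from the question:

```xml
<batch:step id="processList">
    <!-- task-executor lets chunks of this step run on separate threads -->
    <batch:tasklet task-executor="taskExecutor">
        <batch:chunk reader="itemReader" processor="itemProcessor"
                     writer="itemWriter" commit-interval="10"/>
    </batch:tasklet>
</batch:step>

<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>
```

With this setup the reader, processor and writer are called from several threads at once, so they need to be thread-safe.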

Something like the following should help:

<job id="job">
    <step id="step_0" next="split_1">
        <tasklet ref="taskletStep_4"/>
    </step>
    <split id="split_1" next="step_5" task-executor="taskExecutor"> 
        <flow>
            <step id="step_1" next="step_2">
                <tasklet ref="taskletStep_1"/>
            </step>
            <step id="step_2" next="step_3">
                <tasklet ref="taskletStep_2"/>
            </step>
            <step id="step_3">
                <tasklet ref="taskletStep_3"/>
            </step>
        </flow>
        <flow>
            <step id="step_4">
                <tasklet ref="taskletStep_4"/>
            </step>
        </flow>
    </split>
    <step id="step_5">
        <tasklet ref="taskletStep_5"/>
    </step>
</job>

<beans:bean id="taskletStep_1" class="com.test.batch.parallelstep.step.SimpleStep1" />
<beans:bean id="taskletStep_2" class="com.test.batch.parallelstep.step.SimpleStep2" />
<beans:bean id="taskletStep_3" class="com.test.batch.parallelstep.step.SimpleStep3" />
<beans:bean id="taskletStep_4" class="com.test.batch.parallelstep.step.SimpleStep4" />
<beans:bean id="taskletStep_5" class="com.test.batch.parallelstep.step.SimpleStep5" />

<beans:bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor" />

