
Spring Batch Parallel Processing executing one step multiple times

I am executing a Spring Batch job in parallel, using SimpleAsyncTaskExecutor with throttle-limit left at its default (which is 4). The item reader reads lines from a text file, which are then processed. But what is happening is that one line in the text file gets processed by 4 different threads, so a single chunk is executed 4 times.

Below is my batch.xml:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd
        http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
    <import resource="classpath*:/META-INF/spring/batch/override/**/*.xml" />
    <bean id="businessReader" class="com.rbsgbm.rates.eodtasks.batch.reader.BusinessItemReader"/>
    <bean id="businessProcessor" class="com.rbsgbm.rates.eodtasks.batch.processor.BusinessItemProcessor" />
    <bean id="businessWriter" class="com.rbsgbm.rates.eodtasks.batch.writer.BusinessItemWriter" />
    <bean id="deskReader" class="com.rbsgbm.rates.eodtasks.batch.reader.DeskItemReader"/>
    <bean id="deskProcessor" class="com.rbsgbm.rates.eodtasks.batch.processor.DeskItemProcessor" />
    <bean id="deskWriter" class="com.rbsgbm.rates.eodtasks.batch.writer.DeskItemWriter" />
    <bean class="com.rbsgbm.rates.eodtasks.batch.Tasklet.TradeSnapTasklet" id="tradeSnapTasklet"/>
    <bean class="com.rbsgbm.rates.eodtasks.batch.Tasklet.FoundryExtractTasklet" id="foundryExtractTasklet"/>
    <bean id="simpleFireTasklet"
        class="com.rbsgbm.rates.eodtasks.batch.Tasklet.SimpleFireTasklet" />

    <bean id="mdxMarketDataSnapTasklet"
        class="com.rbsgbm.rates.eodtasks.batch.Tasklet.MdxMarketDataSnapTasklet" />

    <bean id="stepListener" class="org.springframework.batch.core.listener.StepExecutionListenerSupport" />
    <bean id="restartJobListener" class="com.rbsgbm.rates.eodtasks.batch.listener.RestartListener"/>
    <bean id="failedStepListener" class="com.rbsgbm.rates.eodtasks.batch.listener.FailedStepStepExecutionListener"/>
    <bean id="taskExecutor"
        class="org.springframework.core.task.SimpleAsyncTaskExecutor">
    </bean>

    <job id="simpleDojJob"  xmlns="http://www.springframework.org/schema/batch">
        <step id="processBusiness" next="simpleFireTask">
            <tasklet>
                <chunk reader="businessReader" processor="businessProcessor"
                    writer="businessWriter" commit-interval="1" />
            </tasklet>

        </step>

        <step id="simpleFireTask" next="foundryTask">
            <tasklet task-executor="taskExecutor">
                <chunk reader="deskReader" processor="deskProcessor"
                    writer="deskWriter" commit-interval="1" />
            </tasklet>

        </step>

        <step id="foundryTask">
            <tasklet ref="foundryExtractTasklet"/>
            <listeners>
                    <listener ref="stepListener"/>
                    <listener ref="restartJobListener"/>
                    <listener ref="failedStepListener"/>
            </listeners>    
        </step>
    </job>
</beans>

If you want thread-safe readers and writers, you have to implement them that way yourself.

By default, every thread accesses the same instance of your reader or writer, potentially at the very same moment. If your reader and writer are not implemented for that, they will not handle it correctly.

The easiest way to make them thread-safe is to mark the reader's read method and the writer's write method as synchronized.

If you cannot change the code of the reader/writer, just implement a simple wrapper and delegate to your reader/writer:

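import org.springframework.batch.item.ItemReader;

// Wraps any ItemReader so that only one thread at a time can call read().
public class SynchronizedItemReader<T> implements ItemReader<T> {
    private ItemReader<T> delegate;

    public void setDelegate(ItemReader<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T read() throws Exception {
        return delegate.read();
    }
}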
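To see the effect of the wrapper without Spring on the classpath, here is a minimal, self-contained sketch. `LineReader`, `ListLineReader` and `SynchronizedLineReader` are hypothetical stand-ins (not Spring Batch classes) that mimic the `ItemReader` contract of returning `null` at the end of the input. The sample lines are made up. Four worker threads drain one shared reader, and every line should be handed out exactly once:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SynchronizedReaderDemo {

    // Hypothetical stand-in for Spring Batch's ItemReader: null signals end of input.
    interface LineReader {
        String read() throws Exception;
    }

    // A deliberately non-thread-safe reader over an in-memory list of lines.
    static class ListLineReader implements LineReader {
        private final Iterator<String> lines;

        ListLineReader(List<String> lines) {
            this.lines = lines.iterator();
        }

        @Override
        public String read() {
            return lines.hasNext() ? lines.next() : null;
        }
    }

    // Same idea as the wrapper above: serialize every read() call.
    static class SynchronizedLineReader implements LineReader {
        private final LineReader delegate;

        SynchronizedLineReader(LineReader delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized String read() throws Exception {
            return delegate.read();
        }
    }

    // Drains the reader from `threads` worker threads and returns everything that was read.
    static List<String> readAll(LineReader reader, int threads) throws Exception {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                String line;
                while ((line = reader.read()) != null) {
                    seen.add(line);
                }
                return null; // makes the lambda a Callable, so read() may throw
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return seen;
    }

    public static void main(String[] args) throws Exception {
        List<String> input = Arrays.asList("desk-1", "desk-2", "desk-3", "desk-4", "desk-5");
        LineReader reader = new SynchronizedLineReader(new ListLineReader(input));
        List<String> seen = readAll(reader, 4);
        Collections.sort(seen); // thread scheduling decides the read order, so sort before printing
        System.out.println(seen); // prints [desk-1, desk-2, desk-3, desk-4, desk-5]
    }
}
```

Passing the bare `ListLineReader` to `readAll` instead would let several threads use the same iterator at once, which is the situation described in the question.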

But note: if you also implement ItemStream to track what has been successfully committed by the writer (and therefore to be able to restart at that position), you also need to manage that, since the chunks can overtake each other.
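One possible way to manage this, shown here only as a sketch and not as anything Spring Batch provides, is to remember which items have been committed and treat only the gap-free prefix as safe to skip on restart. `RestartPositionTracker` and its methods are hypothetical names, and the commit order in `main` is invented to simulate chunks overtaking each other. The value it computes is the kind of position an `ItemStream` implementation could store in its `update` method:

```java
import java.util.TreeSet;

class RestartPositionTracker {
    // Count of items from the start of the input that are committed with no gaps.
    private int safeRestartPosition = 0;
    // Committed item indexes that lie beyond a gap and cannot be counted yet.
    private final TreeSet<Integer> pending = new TreeSet<>();

    // Call after the writer has committed the item at the given zero-based index.
    synchronized void committed(int index) {
        pending.add(index);
        // Advance over every committed index that directly follows the safe prefix.
        while (!pending.isEmpty() && pending.first() == safeRestartPosition) {
            pending.pollFirst();
            safeRestartPosition++;
        }
    }

    // Index of the first item that must be read again after a restart.
    synchronized int getSafeRestartPosition() {
        return safeRestartPosition;
    }

    public static void main(String[] args) {
        RestartPositionTracker tracker = new RestartPositionTracker();
        int[] commitOrder = {1, 0, 3, 4, 2}; // simulated: item 1 finishes before item 0, and so on
        for (int index : commitOrder) {
            tracker.committed(index);
            System.out.println("committed " + index
                    + " -> restart at " + tracker.getSafeRestartPosition());
        }
        // The restart position goes 0, 2, 2, 2, 5 as the commits arrive.
    }
}
```

With this bookkeeping a restart may re-read items that were already committed beyond a gap (items 3 and 4 in the middle of the run above), so the writer has to tolerate seeing an item twice.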


 