
Spring Batch - Pass all data between reader, processor and writer

I'm curious how one would pass all available data from the reader down through the pipeline.

E.g. I want the reader to pull all the data in and pass the entire result set down to the processor and the writer. The result set is small, so I'm not worried about resources. I thought I had implemented this properly by having all of the components (reader, processor, writer) receive and return a collection of the processed items.

While the results of the process appear to be fine, what I am seeing is that the job reads everything in, passes it down through the pipeline, then returns to the reader, reads everything again and passes it down, and so on.

I've considered creating an extra step to read all the data in and pass it down to a subsequent step, but I'm curious whether I can do it this way, and how.

The job looks like

@Bean
Job job() throws Exception {
    return jobs.get("job").start(step1()).build();
}

@Bean
protected Step step1() throws Exception {
    return steps.get("step1")
            .<List<Domain>, List<Domain>>chunk(10)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .build();
}

// ....

The reader, processor and writer accept and return a List, e.g.

class DomainItemProcessor implements ItemProcessor<List<Domain>, List<Domain>>{

You could also implement it as a tasklet. Since you want to process all data at once, you do not really have batch processing, and therefore the restart and failure handling of a "normal" Spring Batch step will not be used at all.

A tasklet like this could look as follows in pseudocode:

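import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemStreamReader;
import org.springframework.batch.item.ItemStreamWriter;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
public class MyTasklet implements Tasklet {

    // open() and close() are declared on ItemStream, so the stream variants are injected
    @Autowired
    private ItemStreamReader<YourType> readerSpringBeanName;

    @Autowired
    private ItemProcessor<List<YourType>, List<YourType>> processorSpringBeanName;

    @Autowired
    private ItemStreamWriter<List<YourType>> writerSpringBeanName;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        readerSpringBeanName.open(new ExecutionContext());
        writerSpringBeanName.open(new ExecutionContext());
        try {
            // Read everything into memory; read() returns null once the input is exhausted
            List<YourType> items = new ArrayList<>();
            YourType readItem = readerSpringBeanName.read();
            while (readItem != null) {
                items.add(readItem);
                readItem = readerSpringBeanName.read();
            }

            // Spring Batch 4 signature is write(List<? extends T>), so the
            // processed list is wrapped as a single item
            List<YourType> processed = processorSpringBeanName.process(items);
            writerSpringBeanName.write(Collections.singletonList(processed));
        } finally {
            readerSpringBeanName.close();
            writerSpringBeanName.close();
        }
        return RepeatStatus.FINISHED;
    }
}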

Moreover, depending on your use case, there is probably no need to define a Spring Batch job at all.
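As a rough illustration of that last point, here is a minimal sketch of the same read-all, process-all, write-all flow in plain Java with no Spring Batch involved. The class `ReadProcessWrite` and its `run` method are hypothetical names made up for this example, and the three lambdas stand in for whatever data access and business logic you actually have.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical helper: runs read-all, process-all, write-all in a single pass.
class ReadProcessWrite {

    static <I, O> void run(Supplier<List<I>> readAll,
                           Function<List<I>, List<O>> processAll,
                           Consumer<List<O>> writeAll) {
        List<I> items = readAll.get();                // pull the whole result set
        List<O> processed = processAll.apply(items);  // transform it in one go
        writeAll.accept(processed);                   // hand everything to the sink
    }

    public static void main(String[] args) {
        List<String> sink = new ArrayList<>(); // in-memory stand-in for a real target
        ReadProcessWrite.<String, String>run(
                () -> List.of("a", "b", "c"),
                items -> items.stream().map(String::toUpperCase).collect(Collectors.toList()),
                sink::addAll);
        System.out.println(sink); // prints [A, B, C]
    }
}
```

What you give up this way is everything the framework adds around a step (job repository, restartability, skip and retry handling), which matches the remark above that those features would not be used anyway.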

The high-level design for this case will be:

  1. The reader will be a custom reader. It will return a List, or a wrapper which contains a list of Domain objects. The reader will inject a DAO bean to perform a query and retrieve the list of Domain objects.

public class DomainList {
    private List<Domain> domains;

    // get/set
}

public class DomainReader implements ItemReader<DomainList> {

    @Autowired
    private DomainDAO domainDAO;

    private List<Domain> domains;

    @Override
    public DomainList read() throws Exception {
        if (this.domains == null) {
            // TODO: replace with your business logic.
            this.domains = this.domainDAO.getListofDomains();
            // wrap the list so that it matches the declared return type
            DomainList domainList = new DomainList();
            domainList.setDomains(this.domains);
            return domainList;
        } else {
            return null; // tells Spring Batch the reader is done
        }
    }
}

  2. The processor and writer will take DomainList as input.

Note: the above is pseudocode.

Thanks, Nghia
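To make step 2 a little more concrete, here is a framework-free sketch of a processor and a writer that treat the whole DomainList as one item. It assumes a `Domain` type with a single `name` field, which the answer does not show, and `DomainListProcessor` and `DomainListWriter` are plain stand-ins for `ItemProcessor<DomainList, DomainList>` and `ItemWriter<DomainList>`. In Spring Batch 4 the writer is called with a `List<? extends DomainList>` (the chunk), so with this design the chunk normally holds exactly one DomainList.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain type; its real fields are not shown in the answer.
class Domain {
    final String name;

    Domain(String name) {
        this.name = name;
    }
}

// Wrapper so that the whole result set travels through the step as ONE item.
class DomainList {
    private final List<Domain> domains;

    DomainList(List<Domain> domains) {
        this.domains = domains;
    }

    List<Domain> getDomains() {
        return domains;
    }
}

// Stand-in for ItemProcessor<DomainList, DomainList>.
class DomainListProcessor {
    DomainList process(DomainList input) {
        List<Domain> out = new ArrayList<>();
        for (Domain d : input.getDomains()) {
            out.add(new Domain(d.name.toUpperCase())); // placeholder business logic
        }
        return new DomainList(out);
    }
}

// Stand-in for ItemWriter<DomainList>: it receives a chunk of items, and each
// item is itself a whole DomainList.
class DomainListWriter {
    final List<String> written = new ArrayList<>(); // in-memory sink for this sketch

    void write(List<? extends DomainList> chunk) {
        for (DomainList list : chunk) {
            for (Domain d : list.getDomains()) {
                written.add(d.name);
            }
        }
    }
}
```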

Ok, this might be a little too late, but here is my take on the implementation. Yes, you could make use of ItemReader, ItemProcessor and ItemWriter to do it. It may be a little overkill, but it can be done.

The main issue I see (since the job keeps coming back to the reader) is that there has to be a way to tell Spring that all items have been read from the ItemReader and there are no more objects to read. To do that, you have to explicitly return null when Spring tries to read more objects.

So here is an example of returning a List from an ItemReader. The read() method should have a similar implementation.

Leaving out the Redis implementation, here is the gist of it: I declare a variable called iterateIndex.

Create and initialize iterateIndex at the start of the item reader like this. I have also included the Redisson cache to store the list; again, that part can be left out.
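To illustrate why the missing null causes the looping, here is a minimal, framework-free sketch. `SimpleReader` is a stand-in for the ItemReader contract (the real `read()` also declares `throws Exception`), `OneShotListReader` is a hypothetical reader that hands out the whole list once, and `countReads` is a heavily simplified model of a chunk-oriented step, not Spring Batch's actual implementation. The `maxReads` cap exists only so the endless case terminates in the demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Stand-in for the ItemReader contract: read() returns the next item, or null
// once the input is exhausted.
interface SimpleReader<T> {
    T read();
}

// Returns the whole list as a single item on the first call, then null.
class OneShotListReader<T> implements SimpleReader<List<T>> {
    private final Supplier<List<T>> source; // hypothetical data source, e.g. a DAO call
    private boolean consumed = false;

    OneShotListReader(Supplier<List<T>> source) {
        this.source = source;
    }

    @Override
    public List<T> read() {
        if (consumed) {
            return null; // signals "no more input"
        }
        consumed = true;
        return source.get();
    }
}

class ChunkLoopDemo {
    // Simplified model of a chunk-oriented step: keep calling read() until it
    // returns null, collecting up to chunkSize items per chunk.
    static <T> int countReads(SimpleReader<T> reader, int chunkSize, int maxReads) {
        int reads = 0;
        boolean exhausted = false;
        while (!exhausted && reads < maxReads) {
            List<T> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize && reads < maxReads) {
                T item = reader.read();
                reads++;
                if (item == null) {
                    exhausted = true;
                    break;
                }
                chunk.add(item);
            }
            // a real step would process and write the chunk here
        }
        return reads;
    }

    public static void main(String[] args) {
        List<String> data = List.of("a", "b", "c");

        // A reader that always returns the list is called until the safety cap.
        SimpleReader<List<String>> endless = () -> data;
        System.out.println(countReads(endless, 10, 50)); // prints 50

        // The one-shot reader is called twice: once for the list, once for null.
        OneShotListReader<String> oneShot = new OneShotListReader<>(() -> data);
        System.out.println(countReads(oneShot, 10, 50)); // prints 2
    }
}
```

With `chunk(10)` and a reader that never returns null, every chunk ends up containing ten copies of the full list, which is the repeated read-and-pass-down behaviour described in the question.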

    public class XXXConfigItemReader implements
            ItemStreamReader<List<FeedbackConfigResponseModel>> {

        private int iterateIndex;

        @Autowired
        Environment env;

        @Autowired
        RestTemplateBuilder templateBuilder;

        public XXXConfigItemReader() {
            this.iterateIndex = 0;
        }

and make sure that read() returns null once iterateIndex has reached the list size

public List<FeedbackConfigResponseModel> read()
        throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
    // Get the config details from the db

    List<FeedbackConfigResponseModel> feedbackConfigModelList = new ArrayList<>();
    // Store all the values from the db here, or read a file line by line
    // and marshal it into this list.
    // On the first read() call iterateIndex is not equal to the list size,
    // so the entire list is returned.

    if (feedbackConfigModelList == null || this.iterateIndex == feedbackConfigModelList.size()) {
        return null;
    } else {
        // Set iterateIndex to the list size so that the second call returns null.
        this.iterateIndex = feedbackConfigModelList.size();
        return feedbackConfigModelList;
    }
}

Hope it helps people who are running into the same issue.

EDIT: Showing how restTemplateBuilder is being used. Note that instead of RestTemplateBuilder you could just autowire the RestTemplate. I used RestTemplateBuilder to apply some additional configuration that my project needs.

Now this is the complete item reader, implemented using the ItemStreamReader interface. It returns one item per read() call.

public class XXXX implements ItemStreamReader<FeedbackConfigResponseModel> {

    private int iterateIndex;

    @Autowired
    Environment env;

    @Autowired
    RestTemplateBuilder templateBuilder;

    @Autowired
    RedissonClient redisClient;

    public XXXX() {
        // index of the last item returned; -1 means nothing has been returned yet
        this.iterateIndex = -1;
    }

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        // nothing to open
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        // no restart state is saved
    }

    @Override
    public void close() throws ItemStreamException {
        // nothing to close
    }

    @Override
    public FeedbackConfigResponseModel read()
            throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        // if the cache is empty, fetch the list through the RestTemplate and cache it
        RList<FeedbackConfigResponseModel> rList = redisClient.getList(AppConstants.CACHE_DBCONFIG_LIST);
        List<FeedbackConfigResponseModel> feedbackConfigModelList;
        if (rList.isEmpty()) {
            String feedbackConfigFetchUrl = this.env.getProperty("restTemplate.default.url") + "/test";
            ResponseEntity<FeedbackConfigResponseListModel> respModelEntity = templateBuilder.build()
                    .getForEntity(feedbackConfigFetchUrl, FeedbackConfigResponseListModel.class);
            System.out.println("Response Model from template:" + respModelEntity.getBody());
            feedbackConfigModelList = respModelEntity.getBody() == null ? null
                    : respModelEntity.getBody().getFeedbackResponseList();
            if (feedbackConfigModelList != null) {
                rList.addAll(feedbackConfigModelList);
            }
        } else {
            System.out.println("coming inside else");
            feedbackConfigModelList = rList;
        }

        // return null once every item has been handed out, so that the step can finish
        if (feedbackConfigModelList == null || this.iterateIndex + 1 >= feedbackConfigModelList.size()) {
            return null;
        } else {
            this.iterateIndex++;
            System.out.println("iterating index " + iterateIndex + " of " + feedbackConfigModelList.size());
            return feedbackConfigModelList.get(iterateIndex);
        }
    }
}
