
spring batch read from multiple files and write to one file

I have a Spring Batch configuration which reads from multiple files and writes multiple files. Is it possible to read from multiple files but write to only one file? Let's say I receive a huge XML file; I split it into small files, use a partitioner, and read the small files in parallel. But I need to write all the data read from the different small XML files to one output file. Is this possible with Spring Batch? I know it is possible by making the writer synchronized (see the sketch after the error below), but I am looking for any other possible way.

Job configuration:

@Bean
    public Job job(final Step parser) {
        return jobBuilderFactory.get("JOB")
                .flow(parser)
                .end()
                .build();
    }

    @Bean
    public Step parser(final Step parserWorker, final Partitioner partitioner) {
        return stepBuilderFactory.get("parser")
                .partitioner("parser", partitioner)
                .step(parserWorker)
                .taskExecutor(taskExecutor())
                .build();
    }

    @Bean
    public Step parserWorker(
            final StaxEventItemReader<Employee> reader,
            final FlatFileItemWriter<Employee> writer) {
        return stepBuilderFactory.get("parserWorker")
                .<Employee, Employee>chunk(Integer.parseInt(chunkSize))
                .reader(reader)
                .writer(writer)
                .build();
    }

    @Bean
    @StepScope
    public StaxEventItemReader<Employee> reader(final @Value("file:#{stepExecutionContext[file]}") Resource resource) {
        StaxEventItemReader<Employee> staxEventItemReader = new StaxEventItemReader<>();
        staxEventItemReader.setResource(resource);
        staxEventItemReader.setFragmentRootElementName("Employee");
        Jaxb2Marshaller unMarshaller = new Jaxb2Marshaller();
        unMarshaller.setClassesToBeBound(Employee.class);
        staxEventItemReader.setUnmarshaller(unMarshaller);
        return staxEventItemReader;
    }

    @Bean
    public FlatFileItemWriter<Employee> fileWriter() {
        FlatFileItemWriter<Employee> fileWriter = new FlatFileItemWriter<>();
        fileWriter.setResource(new FileSystemResource("out.csv"));
        EmployeeAggregator lineAggregator = new EmployeeAggregator();
        fileWriter.setLineAggregator(lineAggregator);
        fileWriter.setLineSeparator(EMPTY_STRING);
        fileWriter.setHeaderCallback(new HeaderCallback());
        fileWriter.setFooterCallback(new FooterCallback());
        return fileWriter;
    }

I get the error: org.springframework.batch.item.ItemStreamException: Output file was not created
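To clarify what I mean by making the writer synchronized: the wrapper would look roughly like this (a minimal sketch, assuming Spring Batch 4.3+ where org.springframework.batch.item.support.SynchronizedItemStreamWriter is available), and the worker step would then use this bean instead of the FlatFileItemWriter directly. I would prefer not to serialize the writes this way:

    // hypothetical wrapper bean; serializes concurrent write() calls on the shared file writer
    @Bean
    public SynchronizedItemStreamWriter<Employee> synchronizedWriter(
            final FlatFileItemWriter<Employee> fileWriter) {
        SynchronizedItemStreamWriter<Employee> writer = new SynchronizedItemStreamWriter<>();
        writer.setDelegate(fileWriter);
        return writer;
    }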

I have a Spring Batch configuration which reads from multiple files and writes multiple files.

You can create an additional step that merges the output files. Since the output file is a flat file, this can be done without any issue (it would be a bit more problematic if the output file were an XML file, since you would need to deal with the XML declaration, headers, etc. when merging files).
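For example, the merge step could be a simple tasklet that concatenates the per-partition files (a rough sketch only: it assumes each worker writes its own file into an output/ directory, and the file and directory names are illustrative; it uses java.io/java.nio.file and org.springframework.batch.repeat.RepeatStatus):

    @Bean
    public Step mergeStep() {
        return stepBuilderFactory.get("mergeStep")
                .tasklet((contribution, chunkContext) -> {
                    // concatenate the per-partition flat files into the single final file
                    try (OutputStream out = Files.newOutputStream(Paths.get("out.csv"));
                         Stream<Path> parts = Files.list(Paths.get("output"))) {
                        for (Path part : (Iterable<Path>) parts.sorted()::iterator) {
                            Files.copy(part, out);
                        }
                    }
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

The job would then chain this step after the partitioned step, e.g. jobBuilderFactory.get("JOB").start(parser).next(mergeStep()).build().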

Another technique is to use a staging area (a table, a queue, etc.) and add a step that reads from the staging area and writes to the final file.
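A rough sketch of the staging-table variant (the EMPLOYEE_STAGING table, its ID/NAME columns and the injected DataSource are illustrative assumptions, not part of the original configuration): the partitioned workers write to the table with a JdbcBatchItemWriter, and a final non-partitioned step drains it into the single flat file.

    // workers write to the staging table instead of the flat file
    @Bean
    public JdbcBatchItemWriter<Employee> stagingWriter(final DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<Employee>()
                .dataSource(dataSource)
                .sql("INSERT INTO EMPLOYEE_STAGING (ID, NAME) VALUES (:id, :name)")
                .beanMapped()
                .build();
    }

    // last step: single-threaded read of the staging table, written through the one FlatFileItemWriter
    @Bean
    public Step exportStep(final DataSource dataSource,
                           final FlatFileItemWriter<Employee> fileWriter) {
        JdbcCursorItemReader<Employee> stagingReader = new JdbcCursorItemReaderBuilder<Employee>()
                .name("stagingReader")
                .dataSource(dataSource)
                .sql("SELECT ID, NAME FROM EMPLOYEE_STAGING ORDER BY ID")
                .rowMapper(new BeanPropertyRowMapper<>(Employee.class))
                .build();
        return stepBuilderFactory.get("exportStep")
                .<Employee, Employee>chunk(100)
                .reader(stagingReader)
                .writer(fileWriter)
                .build();
    }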
