Implementation of TaskExecutor in Spring Batch parallel processing

Question

Consider a Step bean:

  @Bean
  public Step stepForChunkProcessing() {
    return stepBuilderFactory
        .get("stepForChunkProcessing")
        .<Entity1, Entity2>chunk(1000)
        .reader(reader())
        .processor(processor())
        .writer(writer())
        .taskExecutor(taskExecutor())
        .throttleLimit(10)
        .build();
  }

  @Bean
  public TaskExecutor taskExecutor() {
    return new SimpleAsyncTaskExecutor("MyApplication");
  }

Requirement: the reader reads records (of Entity1) from a file, the processor processes them, and the writer writes them to a database.

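Before I added the TaskExecutor, only one thread was created. It looped through the reader and processor 1000 times, as defined by the chunk size above, then moved on to the writer and wrote all 1000 records. After that it started again from record 1001 and ran another 1000 records through the reader and processor. This is synchronous execution.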
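
The single-threaded flow described above can be sketched in plain Java. This is a simplified model, not Spring Batch code: `ChunkLoopSketch` and its `run` method are hypothetical names, an `Iterator` stands in for the file reader, and the returned list of chunks stands in for the database writes. The framework's real loop may order reading and processing differently within a chunk; only the chunk boundary matters here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class ChunkLoopSketch {

    /**
     * Runs a simplified chunk loop on the calling thread.
     * Returns every "written" chunk so the chunk boundaries can be inspected.
     */
    public static <I, O> List<List<O>> run(Iterator<I> reader,
                                           Function<I, O> processor,
                                           int chunkSize) {
        List<List<O>> written = new ArrayList<>();
        while (reader.hasNext()) {
            List<O> chunk = new ArrayList<>(chunkSize);
            // Read and process item by item until the chunk is full
            // or the input is exhausted.
            while (chunk.size() < chunkSize && reader.hasNext()) {
                chunk.add(processor.apply(reader.next()));
            }
            // Hand the whole chunk to the "writer" in one call.
            written.add(chunk);
        }
        return written;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 1; i <= 2500; i++) {
            records.add(i);
        }
        List<List<String>> chunks = run(records.iterator(), i -> "row-" + i, 1000);
        System.out.println(chunks.size());           // 3
        System.out.println(chunks.get(1).get(0));    // row-1001
        System.out.println(chunks.get(2).size());    // 500
    }
}
```

With 2500 input records and a chunk size of 1000, the second chunk starts at record 1001, which matches the behaviour described in the question.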

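After adding the TaskExecutor with a throttle limit of 10, 10 threads are created, each independent of the others. How do they keep track of how many records in the file have already been processed by other threads? Also, if I put the synchronized keyword on the reader's read method, how would the different threads check which records in the file have already been processed?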

Answer 1

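This is not possible in a multi-threaded environment, as stated in the multi-threaded Step section of the reference documentation: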

 Many participants in a Step (such as readers and writers) are stateful.
 If the state is not segregated by thread, then those components are not
 usable in a multi-threaded Step

That is why the documentation tells you to turn off state management, in the javadoc of AbstractItemCountingItemStreamItemReader#setSaveState. Here is an excerpt:

Always set it to false if the reader is being used in a concurrent environment.
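
To illustrate the point about shared state, here is a minimal plain-Java sketch, not Spring Batch code. It assumes a single reader instance shared by all worker threads, with a synchronized `read()` guarding one cursor. `SharedReaderSketch`, `SynchronizedReader`, `Result` and `drain` are hypothetical names, and an in-memory list stands in for the file. The threads never inspect the file to see what others have done; the shared cursor alone decides which record each `read()` call returns.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SharedReaderSketch {

    /** Hypothetical stand-in for a file reader: one cursor shared by all threads. */
    static final class SynchronizedReader<T> {
        private final Iterator<T> delegate;

        SynchronizedReader(Iterator<T> delegate) {
            this.delegate = delegate;
        }

        /** Returns the next record, or null once the input is exhausted. */
        synchronized T read() {
            return delegate.hasNext() ? delegate.next() : null;
        }
    }

    /** Outcome of one run: distinct records seen and records seen more than once. */
    public static final class Result {
        public final int distinct;
        public final int duplicates;

        Result(int distinct, int duplicates) {
            this.distinct = distinct;
            this.duplicates = duplicates;
        }
    }

    /** Drains recordCount records using workerCount threads that share one reader. */
    public static Result drain(int recordCount, int workerCount) {
        List<Integer> records = IntStream.rangeClosed(1, recordCount)
                .boxed()
                .collect(Collectors.toList());
        SynchronizedReader<Integer> reader = new SynchronizedReader<>(records.iterator());
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        AtomicInteger duplicates = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(workerCount);
        for (int w = 0; w < workerCount; w++) {
            pool.execute(() -> {
                Integer item;
                // Each call advances the shared cursor exactly once.
                while ((item = reader.read()) != null) {
                    if (!seen.add(item)) {
                        duplicates.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new Result(seen.size(), duplicates.get());
    }

    public static void main(String[] args) {
        Result r = drain(5000, 10);
        System.out.println(r.distinct + " distinct, " + r.duplicates + " duplicates");
    }
}
```

Every record is handed to exactly one thread, but which thread gets which record, and in what order chunks complete, varies from run to run. A saved item count therefore cannot reliably describe what has been committed, which is the reason for turning saveState off. In Spring Batch itself, the SynchronizedItemStreamReader wrapper in org.springframework.batch.item.support plays a role similar to the hypothetical `SynchronizedReader` here; check the javadoc for the version you use.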

Answer 1: accepted, 2020-06-10 08:47:41
