

Changing the number of running steps by increasing the step pool size in Spring Batch

This is a compound question about how changing thread pool sizes at run-time affects the Spring Batch run-time system.

To start, I would like to clarify my terms: concurrency = number of running steps, and parallelism = number of threads per step.

To make it clear how I am using Spring Batch for my processing: I currently have a large number of files (200+) being generated, and I am using Spring Batch to transfer them, with each step mapping to one file. Everything about the job is dynamic: the number of steps varies, and each step's reader and writer are distinct to that step, so no step shares readers or writers. There is a thread pool dedicated to running the steps concurrently, and each step has its own thread pool so we can do parallelism per step. Combined with the commit interval, this gives great throughput and control.

So my questions are:

  1. How can I change the number of running steps after the Job has started?
  2. How can I change the commit interval after a step has started processing?

So let's consider an example of why I would like to do this and what exactly I mean by changing the "running steps" and "commit interval".

Consider the case where you have a total of 300 steps to process with a step thread pool size of 5. I begin processing and realize that I have more resources to utilize, so I would like to change the thread count to, say, 8. When I actually do this at run-time, what I observe is that the thread pool does increase but the number of running steps does not change. Why is that?

Following similar logic, say I have more memory to utilize; I would then like to increase my commit interval at run-time. Surprisingly, I have not found anything in the StepExecution class that would let me change the commit interval. Why not?

What is interesting is that for parallelism I am able to change the number of running threads by simply increasing that thread pool's size. Just by changing the number of parallel threads, I noticed a massive increase in throughput.

If you would like more information, I can provide code and a link to the repository.

Thank you very much.

While it is possible to make the commit interval and thread pool size configurable and change them at startup time, it is not possible to change them at runtime (i.e. "in-flight") once the job execution has started.

Making the commit interval and thread pool size configurable (via application/system properties, or by passing them as job parameters) will allow you to empirically adapt the values to best utilize your resources without having to recompile/repackage your application.
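As a rough illustration of the startup-time configuration described above, here is a minimal plain-Java sketch that reads both values from system properties. The class name `BatchTuning`, the property names `batch.commitInterval` and `batch.stepPoolSize`, and the defaults are all hypothetical; in a real Spring Batch application the resulting values would be passed to the step's chunk size and to the task executor's pool size.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical settings holder: the class and property names are illustrative
// assumptions, not something Spring Batch defines.
final class BatchTuning {
    final int commitInterval;
    final int stepPoolSize;

    BatchTuning(int commitInterval, int stepPoolSize) {
        if (commitInterval < 1 || stepPoolSize < 1) {
            throw new IllegalArgumentException("both values must be >= 1");
        }
        this.commitInterval = commitInterval;
        this.stepPoolSize = stepPoolSize;
    }

    // Reads -Dbatch.commitInterval=... and -Dbatch.stepPoolSize=...,
    // falling back to assumed defaults when a property is not set.
    static BatchTuning fromSystemProperties() {
        int commitInterval = Integer.getInteger("batch.commitInterval", 100);
        int stepPoolSize = Integer.getInteger("batch.stepPoolSize", 5);
        return new BatchTuning(commitInterval, stepPoolSize);
    }

    public static void main(String[] args) {
        BatchTuning tuning = fromSystemProperties();
        // The pool that would run the steps concurrently is sized once, at startup.
        ExecutorService stepPool = Executors.newFixedThreadPool(tuning.stepPoolSize);
        System.out.println("commitInterval=" + tuning.commitInterval
                + ", stepPoolSize=" + tuning.stepPoolSize);
        stepPool.shutdown();
    }
}
```

Launching with, for example, `-Dbatch.stepPoolSize=8` would then change the pool size for the next run without rebuilding the application.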

The runtime dynamism you are looking for is not available by default, but you can always implement the Step interface yourself and use your implementation in a Spring Batch job alongside the step types the framework provides out of the box.
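To give an idea of what such a custom step could do internally, here is a minimal plain-Java sketch of a chunk loop that re-reads its commit interval at every chunk boundary. It deliberately does not use any Spring Batch types: the class `AdjustableChunkLoop` and its methods are hypothetical, and in a real implementation logic like this would sit inside your own `Step.execute(StepExecution)`, with the actual write and transaction commit where the comment indicates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java stand-in for the body of a hand-written step (hypothetical,
// not Spring Batch API).
final class AdjustableChunkLoop {
    // Shared, mutable commit interval; another thread could call set() on it
    // while the loop is running.
    final AtomicInteger commitInterval;

    AdjustableChunkLoop(int initialCommitInterval) {
        this.commitInterval = new AtomicInteger(initialCommitInterval);
    }

    // Processes the items chunk by chunk and returns the size of each chunk.
    // afterEachCommit is a hook used here only to simulate an external change.
    List<Integer> run(List<String> items, Runnable afterEachCommit) {
        List<Integer> chunkSizes = new ArrayList<>();
        int position = 0;
        while (position < items.size()) {
            // Re-read the interval at each chunk boundary, never mid-chunk.
            int size = Math.max(1, commitInterval.get());
            int end = Math.min(items.size(), position + size);
            List<String> chunk = items.subList(position, end);
            // A real step would write the chunk and commit the transaction here.
            chunkSizes.add(chunk.size());
            position = end;
            afterEachCommit.run();
        }
        return chunkSizes;
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            items.add("item-" + i);
        }
        AdjustableChunkLoop loop = new AdjustableChunkLoop(2);
        // Simulate an operator raising the commit interval to 4 after a commit.
        List<Integer> sizes = loop.run(items, () -> loop.commitInterval.set(4));
        System.out.println(sizes); // prints [2, 4, 4]
    }
}
```

Changing the interval only between chunks keeps each transaction's size fixed for its lifetime, which is the main design choice in this sketch.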


 