
Spring Batch partitioning

We are developing an application that needs to read millions of records from Table A and group them into subgroups (Table B) and master groups (Table C). We are using Spring Batch to do that. The problem is that the grouping is based on the data in Table A: there is one master group for each unique grouping criteria, and one subgroup for every 1000 records that fall into the same master group.

So the structure looks like this:

Records (Table A) --> Subgroup (Table B, one per 1000 records that share a unique grouping criteria) --> Master group (Table C, one per unique grouping criteria)
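To make the grouping rule concrete, here is a minimal single-threaded sketch in plain Java. It is only an illustration under assumptions: the class name `SequentialGrouper`, the `String` grouping key, and the zero-based subgroup index are hypothetical and do not come from the application described above.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative sketch only: assigns each record to a subgroup of at most
 * 1000 records within its master group, processing records one at a time.
 */
public class SequentialGrouper {
    static final int SUBGROUP_SIZE = 1000;

    // Number of records seen so far for each grouping key (one key = one master group).
    private final Map<String, Integer> seenPerMasterGroup = new HashMap<>();

    /** Returns the zero-based subgroup index, within its master group, for the next record with this key. */
    public int assignSubgroup(String groupingKey) {
        int seen = seenPerMasterGroup.merge(groupingKey, 1, Integer::sum);
        // Records 1..1000 go to subgroup 0, records 1001..2000 to subgroup 1, and so on.
        return (seen - 1) / SUBGROUP_SIZE;
    }

    /** Number of distinct master groups encountered so far. */
    public int masterGroupCount() {
        return seenPerMasterGroup.size();
    }
}
```

The counter map is the shared state that makes the rule easy in one thread: every record for a given key passes through the same counter.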

In a non-partitioned step, this concept works fine. But when I partition the step, how can the individual partitions know that the count has reached 1000 and that a new subgroup needs to be created?

Any better ideas for solving this problem are also appreciated.

I believe the partitioning needs to be done in a separate batch job.

I would not go for anything as complicated as two-phase commit, or a custom registry table in which you store a row for every partition ID along with the count of rows that the partition contains.

You can also use Spring Batch ItemReaders and ItemWriters to implement a global trigger mechanism in Java. It would store a map of partition IDs and counts, and when a count reaches 1000, a global Java task would be triggered. The advantage of this approach over implementing something similar in the database is performance.
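A minimal sketch of such an in-memory trigger might look like the following. It is an illustration under assumptions, not a definitive implementation: the class `PartitionCountRegistry`, the method `recordWritten`, and the callback are hypothetical names, and the sketch assumes all partitions run as threads in the same JVM (it would not work across remote workers). It uses only the JDK, so the call from an ItemWriter is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntConsumer;

/**
 * Illustrative sketch only: a thread-safe map of partition ID to record count
 * that fires a callback each time a partition's count reaches the threshold.
 */
public class PartitionCountRegistry {
    private final int threshold;
    // Hypothetical "global task"; receives the ID of the partition that hit the threshold.
    private final IntConsumer onThresholdReached;
    private final Map<Integer, Integer> counts = new ConcurrentHashMap<>();

    public PartitionCountRegistry(int threshold, IntConsumer onThresholdReached) {
        this.threshold = threshold;
        this.onThresholdReached = onThresholdReached;
    }

    /** Intended to be called once per record, for example from a writer. */
    public void recordWritten(int partitionId) {
        boolean[] reached = {false};
        // compute() updates the entry atomically, so concurrent partitions cannot lose increments.
        counts.compute(partitionId, (id, current) -> {
            int next = (current == null ? 0 : current) + 1;
            if (next == threshold) {
                reached[0] = true;
                return 0; // reset so the next batch of records starts a fresh count
            }
            return next;
        });
        // Run the callback outside the map update to keep the atomic section short.
        if (reached[0]) {
            onThresholdReached.accept(partitionId);
        }
    }

    /** Current count for a partition, or 0 if none has been recorded. */
    public int currentCount(int partitionId) {
        return counts.getOrDefault(partitionId, 0);
    }
}
```

With a threshold of 1000, something like `new PartitionCountRegistry(1000, id -> createSubgroup(id))` would invoke the (hypothetical) `createSubgroup` task on every 1000th record of a partition. Because the map lives in memory, the counts are lost if the job restarts, which is the trade-off against the database-backed alternative.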
