Spring Batch Multithreaded DB Reader

If I understand correctly, spring-batch's reader mechanics don't provide a way to multithread the reader step. I've been playing with the idea of using modular arithmetic on the primary key of a database table as a partitioning mechanism, so that each thread runs its own slice of the query. My questions are twofold:

(1) Have I missed something about running multiple threads during the reader step, particularly with regard to making database queries?

(2) If I come up with a good solution, would it be worth opening a Jira for this and submitting it back to the spring-batch codebase? Clearly https://github.com/spring-projects/spring-batch/blob/master/CONTRIBUTING.md would be the starting place, but the Spring team doesn't seem to have an obvious mailing list for this kind of discussion, so I figured I would ask here before opening a ticket.
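To make the modular-arithmetic idea concrete, here is a minimal sketch of how the key space could be split. The class `ModuloPartitions` and its methods are hypothetical names invented for illustration, and the generated SQL assumes the database has a `MOD(x, n)` function (some dialects use the `%` operator instead). In Spring Batch, clauses like these could plausibly be handed to worker steps through a `Partitioner`, but that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spring Batch.
class ModuloPartitions {

    // Builds one WHERE-clause fragment per worker, so that worker i only
    // reads rows whose key leaves remainder i when divided by gridSize.
    // Assumes the SQL dialect supports MOD(column, n).
    static List<String> whereClauses(String keyColumn, int gridSize) {
        if (gridSize < 1) {
            throw new IllegalArgumentException("gridSize must be >= 1");
        }
        List<String> clauses = new ArrayList<>();
        for (int remainder = 0; remainder < gridSize; remainder++) {
            clauses.add("MOD(" + keyColumn + ", " + gridSize + ") = " + remainder);
        }
        return clauses;
    }

    // Returns the index of the worker that owns a given key.
    // Matches the SQL above for non-negative keys.
    static int partitionOf(long id, int gridSize) {
        return (int) Math.floorMod(id, (long) gridSize);
    }
}
```

Because every key falls into exactly one remainder class, the workers never read the same row, and no shared state or locking is needed between them. The split is only even if the keys are spread fairly uniformly across the remainder classes.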

This can be done easily by adding a column called STATUS to your table to track which records have been processed. When you first load data into the table, set the status to 'NOT PROCESSED'. When your ItemReader reads a chunk of records, set their status to 'IN PROGRESS'. Once your ItemProcessor or ItemWriter finishes with them, change the status from 'IN PROGRESS' to 'PROCESSED'. Make sure the method that fetches the data from the database is 'synchronized', so that multiple threads never fetch the same rows.

public List<DomainObject> read() {
  return fetchDataFromDb();
}

// synchronized: only one thread at a time may claim a chunk
private synchronized List<DomainObject> fetchDataFromDb() {
  // Read one chunk-size batch of records whose status is 'NOT PROCESSED'
  // and change the status of those records to 'IN PROGRESS'.
  List<DomainObject> list = new ArrayList<>(); // filled by your query (not shown)
  return list;
}
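The claiming logic can be tried out without a database. The sketch below stands in for the table with an in-memory map; `StatusClaimingStore`, `claimChunk`, `markProcessed`, and `count` are hypothetical names for illustration only, and a real reader would run a SELECT and UPDATE in their place.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a table with a STATUS column.
class StatusClaimingStore {

    enum Status { NOT_PROCESSED, IN_PROGRESS, PROCESSED }

    // Row id -> status, kept in insertion order like an ordered query result.
    private final Map<Integer, Status> rows = new LinkedHashMap<>();

    StatusClaimingStore(int rowCount) {
        for (int id = 1; id <= rowCount; id++) {
            rows.put(id, Status.NOT_PROCESSED);
        }
    }

    // Claims up to chunkSize unprocessed rows and flips them to IN_PROGRESS.
    // synchronized makes the read-then-update pair atomic between threads.
    synchronized List<Integer> claimChunk(int chunkSize) {
        List<Integer> claimed = new ArrayList<>();
        for (Map.Entry<Integer, Status> row : rows.entrySet()) {
            if (claimed.size() == chunkSize) {
                break;
            }
            if (row.getValue() == Status.NOT_PROCESSED) {
                row.setValue(Status.IN_PROGRESS);
                claimed.add(row.getKey());
            }
        }
        return claimed;
    }

    // Called once the processor or writer has finished with the chunk.
    synchronized void markProcessed(List<Integer> ids) {
        for (Integer id : ids) {
            rows.put(id, Status.PROCESSED);
        }
    }

    synchronized long count(Status status) {
        return rows.values().stream().filter(s -> s == status).count();
    }
}
```

Note that 'synchronized' only serializes threads inside a single JVM. If several JVMs read the same table, the claim would have to be made atomic in the database itself, for example with row locking.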
