简体   繁体   中英

Copy table having millions of records from one database to another - Spring Boot + Spring JDBC

Working on a small example where have to copy millions of records from teradata database to Oracle DB.

Environment: Spring Boot + Spring JDBC (jdbcTemplate) + Spring REST + Spring Scheduler + Maven + Oracle + Teradata

Using Spring JDBC's batchUpdate to insert data into target Database Oracle.

Using teradata's 'top 1000' in SQL query from source database.

fecthDataResults = repository.queryForList(
                "select top 1000 A, B, C, D, E from " + schemaName + ".V_X");

Querying from a View "V_X".

This View has 40 million records and spring boot application will choke if it runs.

Also inserting into 2 tables (Primary and Backup) in target Oracle DB.

Whats the best way to fetch and load/copy 40 million records making sure that copying was done successfully into 2 tables.

Spring Scheduler to schedule the batch copy at specified time/interval. Spring Rest to invoke copying manually - both of which is achieved.

Any suggestions would be appreciated.

Thanks.

There are different ways you can solution this:

  1. Logstash Approach - Specify your source and destination data and load the data to both the destination DBs. It has cron support and the logstash can run based on the schedule. It is quite faster. You can specify how many rows you wanna fetch every time.

  2. Use an ETL tool. You can go with any of the open source versions if you do have the ecosystem in place. Talend is a good candidate where you can design your job and export as runnable Jar. You can schedule this by using any component of your choice.

  3. Spring Batch. Please refer this question. Spring RESTful web services - High volume data processing

  4. Spring Cloud Data Flow or Spring boot with a MQ as an intermediate store between your datasources. You may have to introduce message queues to handle failover, fallback mechanisms. Highly reliable and can implemented in a async manner.

My personal opinion is to go with Logstash. If you feel any of the above solutions make sense. I can elaborate them if you want.

well in base of the information you give and following the chosen stack, in my opinion you have two possibilities, first create a project with spring batch in addition with spring batch admin or spring integration to deal with rest. The second in fact uses the first solution in a big data concept using Spring XD I recommend you use a profiler to speed up the performance as much as you could.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM