
Optimize downloading data from a database using Java

We are downloading around 10 million records from an Oracle/MSSQL database using Java.

We fetch 5,000 records at a time, and each batch takes about 5 minutes, depending on the WHERE condition. Fetching and downloading all 10 million records would therefore take (10 million / 5,000) * 5 minutes, i.e. 2,000 batches and 10,000 minutes.

We have tried fetching 100,000 records at a time, but that might run into heap space issues.

Is there any way to optimize this?

You will need to determine how many records the query can safely return at once. Look at the average size of a record, RecordSize, and determine the maximum amount of memory a batch may occupy, MaxSize. The number of records you can load at once is

MaxSize / RecordSize

But you might want to load fewer records, to leave headroom in case records turn out slightly larger on average than expected:

0.9 * MaxSize / RecordSize
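The estimate above can be sketched in Java as follows. This is only a rough illustration: the class and method names are made up for this example, and the 256 MiB budget and 2 KiB record size in `main` are assumed figures, not measurements from the question.

```java
/** Rough batch-size estimate; all names and numbers here are illustrative. */
final class BatchSizing {
    private BatchSizing() {}

    /**
     * Computes safetyFactor * MaxSize / RecordSize, rounded down.
     *
     * @param maxSizeBytes       memory budget for one batch (MaxSize)
     * @param avgRecordSizeBytes average in-memory size of one record (RecordSize)
     * @param safetyFactor       fraction of the budget to use, e.g. 0.9
     */
    static long recordsPerBatch(long maxSizeBytes, long avgRecordSizeBytes, double safetyFactor) {
        if (maxSizeBytes <= 0 || avgRecordSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        if (safetyFactor <= 0.0 || safetyFactor > 1.0) {
            throw new IllegalArgumentException("safetyFactor must be in (0, 1]");
        }
        return (long) Math.floor(safetyFactor * maxSizeBytes / avgRecordSizeBytes);
    }

    public static void main(String[] args) {
        long maxSize = 256L * 1024 * 1024; // assumed budget: 256 MiB
        long recordSize = 2048;            // assumed average record: 2 KiB
        System.out.println(recordsPerBatch(maxSize, recordSize, 0.9));
    }
}
```

The average record size is best measured on a sample of real rows, since wide text or LOB columns can make individual records far larger than the average.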

Also, you will need to optimize your query so that it:

  • does not load unnecessary columns
  • evaluates its WHERE, HAVING and ON clauses faster

You could also split the query into two steps: first run the query with the real conditions and fetch only the ids, then fetch the real columns using only the ids as the condition. This is especially helpful if you use joins to gather some of the columns and do not need all of those joins when gathering the ids.
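A minimal JDBC sketch of the two-step idea is shown below. The table `orders`, its columns, and the `status` filter are hypothetical placeholders for the real schema and conditions, and `TwoStepFetch` and `RowHandler` are names invented for this example. The ids are processed in chunks because Oracle limits an IN list to 1000 expressions and SQL Server allows at most 2100 parameters per statement.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Two-step fetch sketch; table and column names are hypothetical. */
final class TwoStepFetch {
    /** Callback invoked once per detail row (illustrative helper type). */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    // Step 1: apply the real (expensive) conditions, but return only the ids.
    static final String ID_QUERY = "SELECT id FROM orders WHERE status = ?";

    // Step 2: fetch the needed columns, filtered by id only.
    static String detailQuery(int idCount) {
        if (idCount < 1) {
            throw new IllegalArgumentException("idCount must be >= 1");
        }
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "SELECT id, customer_name, total FROM orders WHERE id IN (" + placeholders + ")";
    }

    /** Splits the id list into consecutive chunks of at most chunkSize elements. */
    static List<List<Long>> chunk(List<Long> ids, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<List<Long>> chunks = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += chunkSize) {
            chunks.add(new ArrayList<>(ids.subList(i, Math.min(i + chunkSize, ids.size()))));
        }
        return chunks;
    }

    static List<Long> fetchIds(Connection conn, String status) throws SQLException {
        List<Long> ids = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(ID_QUERY)) {
            ps.setString(1, status);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
            }
        }
        return ids;
    }

    static void fetchDetails(Connection conn, List<Long> ids, int chunkSize, RowHandler handler)
            throws SQLException {
        for (List<Long> part : chunk(ids, chunkSize)) {
            try (PreparedStatement ps = conn.prepareStatement(detailQuery(part.size()))) {
                for (int i = 0; i < part.size(); i++) {
                    ps.setLong(i + 1, part.get(i)); // JDBC parameters are 1-based
                }
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        handler.handle(rs);
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        // No database needed here: just show the generated SQL and the chunking.
        System.out.println(detailQuery(3));
        System.out.println(chunk(Arrays.asList(1L, 2L, 3L, 4L, 5L), 2));
    }
}
```

With a live connection, the calls would be `fetchIds(conn, ...)` followed by `fetchDetails(conn, ids, 1000, handler)`. Note that the id list itself has to fit in memory; for very large result sets the first step may also need to be run in ranges.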

You can also improve the database itself by normalizing it, if it is not already normalized, and by indexing the columns used in your conditions. Be careful with indexing, though: it speeds up reads but slows down writes.
