简体   繁体   中英

Reading different sets of data from same DB table by various java threads

My question is very similar to this old SO question and no satisfactory answer is posted there.

There is a DB table in DB2 for which I am trying to do parallel record reads by two or more separate Java threads provided those threads should read different sets of data ie if THREAD-1 has read first 1000 records , THREADS-2 should not select those records but different 1000 ones if available.

Since threads will be reading different rows, no clash is expected at read time. Connection Object will also not be shared among threads - they will use their own DB Connections.

Database is DB2 and this table is already range partitioned on one of columns.

Number of threads will not be fixed but configurable so we can run with more threads on more powerful machines.

What could be strategies for limiting reader threads to a particular area of table and not to step on each others shoes ie how to logically divide table so threads keep reading on different areas? Can Java process utilize DB2 range partitions?

One solution is to handover different ranges of primary keys to these threads but for that I will have to read whole table at startup and that I am trying to avoid.

One way could be to select ony a portion of the data. Assuming your key is a number (or a number is part of your key) you could use a modulo function (mod) in the WHERE condition - this would be flexible enough for a configurable number of threads.

SELECT... FROM... WHERE mod(key, 3) = 0 

for the first thread of the three and mod(key, 3) = 1 for the second and so on.

If you really focus on partitions you have to query the DB2 system catalog first to retrieve the partition information for your table (SYSCAT.DATAPARTITIONS)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM