简体   繁体   中英

Oracle:query optimization

I have two tables tab1 and tab2 . tab1 has 108000 rows and tab2 has 1200000 rows.
Here is sample data
tab1

+-----------------------------------------------------+
|       Low        |         high        | Region_id  |
+-----------------------------------------------------+
|5544220000000000  |   5544225599999999  |     1      |
|5544225500000000  |   5544229999999999  |     2      |
|5511111100000000  |   5511111199999999  |     3      |
+-----------------------------------------------------+    

tab2

+------------------+
|       pan        |
+-------------------
|5544221111111111  |
|5544225524511244  |
|5511111111254577  |
+------------------+ 

So I run a query like this

select t2.pan, t1.region_id from tab2 t2
 join tab1 t1 on t2.pan between t1.low and t1.high;

What I'm trying to do is finding in which range does tab2.pan exist and retrieving it's region_id : Ranges are unique, Meanning that low and high pairs are distinct.
I tried adding indexes, running in parallel but the query is running very slow(about 3 hours). Can anyone suggest something to fasten the query, it can be adding some kind of indexes, or changing data structure or anything else.
I'm running the query against Oracle 11gR2.
UPDATE
From the comments i tested several things
Adding index like (high, low) and adding index (pan) and (high, low, region), Both ways there goes index full scan, i also tried index on(low,high) and index on pan, this way goes index range scan on tab1 and index full scan on tab2, but anyways it seems extremely slow.

If you have no overlaps and each value in tab1 matches exactly one row in tab2 , then I think the best approach is a correlated subquery with the right indexes:

select t.*, t2.region_id
from (select t1.*,
             (select max(t2.low)
              from tab2 t2
              where t2.low <= t.pan
             ) as t2low
      from tab1 t1
     ) t join
     tab2 t2
     on t.t2low = t2.low;

The index that you want is tab2(low, region) . This index should be used very efficiently for the subquery to get the closest low value. The join should then be quite fast as well.

Does this help your performance?

EDIT:

I should note that in the above query, you can test for the high value in the outer join. This should be fine if the the low values are unique, because the join on low will be really fast. So:

select t.*,
       (case when t.pan <= t2.high then t2.region_id end) as region_id
from (select t1.*,
             (select max(t2.low)
              from tab2 t2
              where t2.low <= t.pan
             ) as t2low
      from tab1 t1
     ) t join
     tab2 t2
     on t.t2low = t2.low;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM