
Oracle: query optimization

I have two tables, tab1 and tab2. tab1 has 108000 rows and tab2 has 1200000 rows.
Here is sample data:
tab1

+-----------------------------------------------------+
|       low        |        high         | region_id  |
+-----------------------------------------------------+
|5544220000000000  |   5544225599999999  |     1      |
|5544225500000000  |   5544229999999999  |     2      |
|5511111100000000  |   5511111199999999  |     3      |
+-----------------------------------------------------+

tab2

+------------------+
|       pan        |
+------------------+
|5544221111111111  |
|5544225524511244  |
|5511111111254577  |
+------------------+

So I run a query like this:

select t2.pan, t1.region_id from tab2 t2
 join tab1 t1 on t2.pan between t1.low and t1.high;

What I'm trying to do is find which range each tab2.pan falls into and retrieve its region_id. Ranges are unique, meaning that the (low, high) pairs are distinct.
I tried adding indexes and running in parallel, but the query is still very slow (about 3 hours). Can anyone suggest something to speed up the query? It could be adding some kind of index, changing the data structure, or anything else.
I'm running the query against Oracle 11gR2.
UPDATE
Following the comments, I tested several things:
Adding an index on (high, low), and adding indexes on (pan) and (high, low, region): both ways result in an index full scan. I also tried an index on (low, high) together with an index on pan; this gives an index range scan on tab1 and an index full scan on tab2, but it still seems extremely slow.

If you have no overlaps and each value in tab2 matches exactly one row in tab1, then I think the best approach is a correlated subquery with the right indexes:

-- For each pan, find the greatest low that does not exceed it,
-- then join back to tab1 on that low to pick up the region.
select t.*, t1.region_id
from (select t2.*,
             (select max(t1.low)
              from tab1 t1
              where t1.low <= t2.pan
             ) as t1low
      from tab2 t2
     ) t join
     tab1 t1
     on t.t1low = t1.low;

The index that you want is tab1(low, region_id). The subquery should be able to use this index very efficiently to get the closest low value at or below each pan. The join should then be quite fast as well.

Does this help your performance?

EDIT:

I should note that in the above query, you can test for the high value in the outer query. This should be fine if the low values are unique, because the join on low will be really fast. So:

-- Same lookup, but region_id is NULL when pan lies above the matched range's high.
select t.*,
       (case when t.pan <= t1.high then t1.region_id end) as region_id
from (select t2.*,
             (select max(t1.low)
              from tab1 t1
              where t1.low <= t2.pan
             ) as t1low
      from tab2 t2
     ) t join
     tab1 t1
     on t.t1low = t1.low;
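
As a rough sanity check of the lookup logic, here is a small Python sketch that runs the max(low) lookup against the question's sample data. It uses an in-memory SQLite database purely for illustration (an assumption on my part; the question targets Oracle 11gR2, and this says nothing about Oracle's execution plan or timing). It follows the question's layout, with the ranges in tab1 and the values in tab2. The index name tab1_low_idx and the extra pan 5544230000000001 are hypothetical, added only to show a value that falls outside every range.

```python
import sqlite3

# In-memory SQLite stands in for Oracle here (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (low INTEGER, high INTEGER, region_id INTEGER);
    CREATE TABLE tab2 (pan INTEGER);
    -- Hypothetical index name; leading column low supports the max(low) lookup.
    CREATE INDEX tab1_low_idx ON tab1 (low, high, region_id);
""")

# Sample ranges from the question.
conn.executemany(
    "INSERT INTO tab1 (low, high, region_id) VALUES (?, ?, ?)",
    [
        (5544220000000000, 5544225599999999, 1),
        (5544225500000000, 5544229999999999, 2),
        (5511111100000000, 5511111199999999, 3),
    ],
)

# Sample pans from the question, plus one hypothetical pan above every range.
conn.executemany(
    "INSERT INTO tab2 (pan) VALUES (?)",
    [
        (5544221111111111,),
        (5544225524511244,),
        (5511111111254577,),
        (5544230000000001,),  # hypothetical: not in the original data
    ],
)

QUERY = """
SELECT t.pan,
       CASE WHEN t.pan <= r.high THEN r.region_id END AS region_id
FROM (SELECT t2.pan,
             (SELECT MAX(lo.low)
              FROM tab1 lo
              WHERE lo.low <= t2.pan) AS t1low
      FROM tab2 t2) t
JOIN tab1 r ON t.t1low = r.low
ORDER BY t.pan
"""

rows = conn.execute(QUERY).fetchall()
for pan, region_id in rows:
    print(pan, region_id)
```

One thing this makes visible: the first two sample ranges overlap, so 5544225524511244 falls inside both. The max(low) lookup returns only the range with the greater low (region 2), whereas the original BETWEEN join would return both regions. If real data contains overlaps like this, the two queries are not equivalent.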
