
A slow query in Oracle (runs fast in SQL Server) (a correlated subquery)

We are troubled by the symptom below and would appreciate any advice on its cause and how to resolve it.

I ran the simple query below. Oracle took 3,800 seconds to return the result, while the same query completes in a few seconds on a SQL Server database that has the same tables. (The database is used as a data mart.)

Query:

select
T_Y.Col1
,(select count(1) from T_X where T_X.colX = T_Y.colY) as cnt1
from T_Y

Record counts:

T_X: 96,536

T_Y: 129,359

Other info:

-ColY is the primary key of T_Y, and there is no index on ColX in either environment.

-Oracle 11.1 (ran the query using SQL Developer)

-SQL Server 2008 (ran the query using SSMS)

-No big difference in hardware specs between the two environments.

-The query above is part of a bigger one. We simplified it and found that this part was the bottleneck.

We would appreciate your advice!

Additional Info (the purpose of the query)

The query above is part of the query below. Our purpose is to find (the count of) records in T_Y that have no corresponding records in the other tables (T_A, T_B, T_C, T_D, T_E, T_X).

select          
  count(1)          
from            
  (select
     T_Y.ColA
    ,T_Y.ColG
    ,T_Y.ColH
    ,(select count(1) from T_A A where A.ColA = T_Y.ColY) as cnt1
    ,(select count(1) from T_B B where B.ColB = T_Y.ColY) as cnt2
    ,(select count(1) from T_X where T_X.ColX = T_Y.ColY) as cnt3
    ,(select count(1) from T_C C where C.ColC = T_Y.ColY) as cnt4
    ,(select count(1) from T_D D where D.ColD = T_Y.ColY) as cnt5
    ,(select count(1) from T_E E where E.ColE = T_Y.ColY) as cnt6
  from T_Y
  )XXX
where 1=1
  and XXX.ColH in ('X')
  and XXX.cnt1 = 0
  and XXX.cnt2 = 0
  and XXX.cnt3 = 0
  and XXX.cnt4 = 0
  and XXX.cnt5 = 0
  and XXX.cnt6 = 0
;           

Execution Plan - Oracle (for the original query) (from Explain Plan)

Operation                            Optimizer  Cost  Cardinality  Bytes   Filter Predicates
SELECT STATEMENT                     ALL_ROWS   121   129359       776154
SORT(AGGREGATE)                                       1            6
TABLE ACCESS(FULL) XXXXX.T_X         ANALYZED   6616  2            12      "T_X"."ColX"=:B1
INDEX(FAST FULL SCAN) XXXXX.T_Y_0    ANALYZED   121   129359       776154

(The Partition Start, Partition Stop, Partition Id, and Access Predicates columns are empty for every operation.)

Execution Plan - SQL Server (for the original query)

Line 7 indicates that SQL Server uses a Clustered Index Scan instead of a Table Scan, even though the clustered index doesn't include ColumnY. Could anyone explain what this means? Can I force Oracle to use a similar execution plan with a hint or some other means?

  |--Compute Scalar(DEFINE:([Expr1008]=CASE WHEN [Expr1006] IS NULL THEN (0) ELSE [Expr1006] END))
       |--Parallelism(Gather Streams)
            |--Hash Match(Right Outer Join, HASH:([DB_X].[dbo].[T_X].[ColX])=([DB_X].[dbo].[T_Y].[ColY]), RESIDUAL:([DB_X].[dbo].[T_X].[ColX]=[DB_X].[dbo].[T_Y].[ColY]))
                 |--Compute Scalar(DEFINE:([Expr1006]=CONVERT_IMPLICIT(int,[Expr1013],0)))
                 |    |--Hash Match(Aggregate, HASH:([DB_X].[dbo].[T_X].[ColX]), RESIDUAL:([DB_X].[dbo].[T_X].[ColX] = [DB_X].[dbo].[T_X].[ColX]) DEFINE:([Expr1013]=COUNT(*)))
                 |         |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([DB_X].[dbo].[T_X].[ColX]))
                 |              |--Clustered Index Scan(OBJECT:([DB_X].[dbo].[T_X].[PK_T_X]))
                 |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([DB_X].[dbo].[T_Y].[ColY]))
                      |--Clustered Index Scan(OBJECT:([DB_X].[dbo].[T_Y].[PK_T_Y]))

You need an index on t_x(colX). I am guessing this index exists on SQL Server.
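The Oracle plan in the question shows TABLE ACCESS(FULL) on T_X with the filter "T_X"."ColX"=:B1, which suggests that T_X is scanned in full once for each row of T_Y. A minimal sketch of the suggested index follows; the index name T_X_COLX_IDX is hypothetical, and whether the optimizer actually uses the index depends on statistics and data distribution.

```sql
-- Hypothetical index name; ColX is the column compared with T_Y.ColY
-- in the correlated subquery.
create index T_X_COLX_IDX on T_X (ColX);
```

With such an index in place, each subquery execution could be answered by an index lookup instead of a full table scan.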

This version might be faster on either machine:

select t_x.colX, count(1)
from T_X 
group by t_x.colX;

It is not exactly the same, but it might be what you really want.

Our purpose is to find (the count of) records in T_Y that have no corresponding records in the other tables (T_A, T_B, T_C, T_D, T_E, T_X).

That would be better expressed as:
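If a count is needed for every row of T_Y, including rows with no match, the aggregate can be joined back to T_Y. This is roughly the shape of the SQL Server plan in the question: a hash aggregate on T_X.ColX, a right outer join to T_Y, and a final step that maps NULL to 0. The sketch below selects ColY only for illustration, and the aliases y, x, and cnt are assumptions.

```sql
-- Count T_X rows once per ColX value, then outer-join the result to T_Y.
select
  y.ColY
 ,coalesce(x.cnt, 0) as cnt1          -- no matching T_X rows -> 0
from T_Y y
left outer join
  (select ColX, count(*) as cnt
   from T_X
   group by ColX) x
  on x.ColX = y.ColY;
```

Written this way, T_X is read once rather than once per T_Y row, so the query does not depend on an index on ColX.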

select          
  count(*)          
from            
  T_Y
where ColH in ('X') and
      not exists (select null from T_A A where A.ColA = T_Y.ColY) and
      not exists (select null from T_B B where B.ColB = T_Y.ColY) and
      not exists (select null from T_X where T_X.ColX = T_Y.ColY) and
      not exists (select null from T_C C where C.ColC = T_Y.ColY) and
      not exists (select null from T_D D where D.ColD = T_Y.ColY) and
      not exists (select null from T_E E where E.ColE = T_Y.ColY);
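To see how Oracle handles the NOT EXISTS form, the plan can be inspected with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. The sketch below uses only one of the six predicates to keep it short. Ideally the plan shows an anti-join operation (for example HASH JOIN ANTI) rather than a filter that rescans T_X per row, but which one the optimizer picks depends on statistics and version.

```sql
-- Store the plan for a reduced version of the query in the plan table.
explain plan for
select count(*)
from T_Y
where ColH in ('X')
  and not exists (select null from T_X where T_X.ColX = T_Y.ColY);

-- Display the most recently explained plan.
select * from table(dbms_xplan.display);
```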
