How to optimize performance of this SQL query
I need to find the age for each day, and I need to get it for all previous dates in a single query. So I used the following query:
select trunc(sysdate) - level + 1 "DATE"
,trunc(sysdate) - level + 1 - created_date AGE from items
connect by trunc(sysdate) - level + 1 - created_date > 0
The output I get (date and age) is correct:
DATE AGE
--------- ----------
6-JUL-15 22
5-JUL-15 21
4-JUL-15 20
3-JUL-15 19
2-JUL-15 18
1-JUL-15 17
30-JUN-15 16
29-JUN-15 15
28-JUN-15 14
27-JUN-15 13
26-JUN-15 12
25-JUN-15 11
24-JUN-15 10
Now I need to calculate the average age for each day, so I added avg to the query as follows:
select trunc(sysdate) - level + 1 "DATE" ,
avg(trunc(sysdate) - level + 1 - created_date ) AVERAGE_AGE
from items
connect by trunc(sysdate) - level + 1 - created_date > 0
group by trunc(sysdate) - level + 1
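Is this query correct? When I add the aggregate function (avg) to the query, it takes 1 hour to retrieve the data. When I remove the avg function, it returns the result within 2 seconds. What is a possible way to calculate the average without hurting performance?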
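Sorry, I have never used Oracle, so even though I tried reading the docs for the syntax details, there may be some mistakes:
You said this query does the job in 2 seconds: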
select trunc(sysdate) - level + 1 "DATE"
,trunc(sysdate) - level + 1 - created_date AGE from items
connect by trunc(sysdate) - level + 1 - created_date > 0
So we will keep it and create a view from it:
CREATE OR REPLACE VIEW my_view AS
(select
trunc(sysdate) - level + 1 AS date_col,
trunc(sysdate) - level + 1 - created_date AS age_col
from items
connect by trunc(sysdate) - level + 1 - created_date > 0);
However, we may be able to avoid some redundant computation by doing the following:
CREATE OR REPLACE VIEW distinct_dates AS
(
SELECT DISTINCT trunc(sysdate) - level + 1 AS date_distinct
from items
connect by trunc(sysdate) - level + 1 - created_date > 0
);
-- join back to items: created_date is not a column of distinct_dates
CREATE OR REPLACE VIEW my_view AS
(select
d.date_distinct AS date_col,
d.date_distinct - i.created_date AS age_col
from distinct_dates d
join items i on d.date_distinct > i.created_date);
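Why do I do this? Because the problem seems to lie in the aggregation, I suspect the view is actually being computed multiple times in your code. The next step is simply to run the calculation on the view: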
select
date_col ,
AVG(age_col)
from my_view
group by date_col;
Finally, the complete code would be:
CREATE OR REPLACE VIEW distinct_dates AS
(
SELECT DISTINCT trunc(sysdate) - level + 1 AS date_distinct
from items
connect by trunc(sysdate) - level + 1 - created_date > 0
);
-- join back to items: created_date is not a column of distinct_dates
CREATE OR REPLACE VIEW my_view AS
(select
d.date_distinct AS date_col,
d.date_distinct - i.created_date AS age_col
from distinct_dates d
join items i on d.date_distinct > i.created_date);
select
date_col ,
AVG(age_col)
from my_view
group by date_col;
Or, if that does not work:
CREATE OR REPLACE VIEW my_view AS
(select
trunc(sysdate) - level + 1 AS date_col,
trunc(sysdate) - level + 1 - created_date AS age_col
from items
connect by trunc(sysdate) - level + 1 - created_date > 0);
select
date_col ,
AVG(age_col)
from my_view
group by date_col;
Revised query:
select tdate, avg(trunc(tdate)-created_date) AVG_AGE
from (
select trunc(sysdate) - level + 1 tdate
from (select min(created_date) dt from items)
connect by trunc(sysdate) - level + 1 - dt > 0 ) dates
join items on dates.tdate > items.created_date
group by tdate order by tdate
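To make the logic of the revised query easier to follow, here is a minimal Python sketch of the same two steps: build one row per calendar day after the earliest created_date, then join each day to the items created before it and average the ages. The function name average_age_per_day, the two sample dates and the "today" of 2015-07-06 are assumptions chosen for illustration, and created_date is assumed to carry no time component.

```python
from collections import defaultdict
from datetime import date, timedelta


def average_age_per_day(created_dates, today):
    """Model of the revised query: a calendar of days, joined to items, then averaged."""
    first = min(created_dates)  # select min(created_date) dt from items
    # Inner subquery: every day after the earliest created_date, up to today.
    days = [first + timedelta(days=n) for n in range(1, (today - first).days + 1)]

    ages = defaultdict(list)
    for day in days:
        for created in created_dates:
            if day > created:  # join condition: dates.tdate > items.created_date
                ages[day].append((day - created).days)

    # group by tdate order by tdate
    return {day: sum(v) / len(v) for day, v in sorted(ages.items())}


# Hypothetical sample data: two items, queried on 2015-07-06.
result = average_age_per_day([date(2015, 6, 1), date(2015, 6, 20)], date(2015, 7, 6))
for day, avg_age in result.items():
    print(day, avg_age)
```

The calendar has one row per day no matter how many items exist, so the join produces only one row per (day, item) pair that actually qualifies.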
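Suppose you have only two rows, dated "2015-06-01" and "2015-06-20". By my calculation, your hierarchical query generates 1376254 rows for them, which is probably not what you want; it should generate 51 rows (35 + 16). That is why it takes so long: the output grows exponentially as the items table gets more rows.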
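The row counts quoted above can be reproduced with a small Python sketch. It rests on an assumed model of CONNECT BY without PRIOR or START WITH: every row is a root at level 1, and every row at level n gets as children all rows that still satisfy the age condition at level n + 1. The function name connect_by_row_count and the "today" of 2015-07-06 (the date that makes the ages 35 and 16) are assumptions for illustration.

```python
from datetime import date


def connect_by_row_count(created_dates, today):
    """Count rows produced by the original hierarchical query under the assumed model."""
    ages = [(today - d).days for d in created_dates]  # age of each item at level 1
    rows_at_level = len(ages)  # level 1: every row of items is a root
    total = rows_at_level
    level = 2
    while True:
        # Rows that satisfy: trunc(sysdate) - level + 1 - created_date > 0
        qualifying = sum(1 for age in ages if age - level + 1 > 0)
        if qualifying == 0:
            break
        rows_at_level *= qualifying  # each parent row gets every qualifying row as a child
        total += rows_at_level
        level += 1
    return total


today = date(2015, 7, 6)
created = [date(2015, 6, 1), date(2015, 6, 20)]
total_rows = connect_by_row_count(created, today)
wanted_rows = sum((today - d).days for d in created)  # one row per item per day of age
print(total_rows, wanted_rows)
```

With two items the count doubles at each of the first 16 levels, then stays at 65536 rows per level for the remaining 19 levels, which is where the total comes from.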
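You could fix your query by adding some kind of counter (generated by rownum or row_number) and then adding "and prior rn = rn" to the connect by clause, but the query shown above is simpler. I added a second query in SQLFiddle to compare the results; both produce the same output.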
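As a rough illustration of the counter idea, the Python sketch below assumes that tying each child to its parent through rn means every item only ever chains to itself, so each item contributes one row per level while its age stays positive. The function name rows_with_counter, the sample dates and the "today" of 2015-07-06 are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def rows_with_counter(created_dates, today):
    """Model of the hierarchy when each row may only connect to itself (same rn)."""
    rows = []
    for created in created_dates:  # each loop pass stands for one rn value
        level = 1
        # Condition from the query: trunc(sysdate) - level + 1 - created_date > 0
        while (today - created).days - level + 1 > 0:
            day = today - timedelta(days=level - 1)
            rows.append((day, (day - created).days))
            level += 1
    return rows


# Hypothetical sample data: two items, queried on 2015-07-06.
rows = rows_with_counter([date(2015, 6, 1), date(2015, 6, 20)], date(2015, 7, 6))

by_day = defaultdict(list)
for day, age in rows:
    by_day[day].append(age)
averages = {day: sum(a) / len(a) for day, a in by_day.items()}
print(len(rows), averages[date(2015, 7, 6)])
```

Under this assumption the row count is linear in the total age of the items, and the per-day averages are computed from the same (day, age) pairs as in the join-based query.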