Oracle: Optimizing twice self-join query
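I've spent the past two days trying to make this query more efficient. I've learned a lot more about how Oracle indexes behave along the way, and at this point I'm confused about what should work and what shouldn't.
Basically, the query sums values and compares them with the values from yesterday and last week.
I've tried breaking it apart, played around with analytic queries, and changed the order of the index columns, but nothing seems to help. All my tests were done on a table with 500K rows; once I run it on a table with 20 million rows, it takes forever.
Any help is greatly appreciated.
I've revised the original post to make it easier to help. :)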
CREATE TABLE TABLE_1
(ORDER_LINE_ID NUMBER, OFFSET NUMBER, BREAK_ID NUMBER, ZONE NUMBER, NETWORK NUMBER, HOUR_OF_DAY NUMBER, START_TIME DATE, END_TIME DATE, SUCCESS NUMBER,
CONSTRAINT "TABLE_1_PK" PRIMARY KEY (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, HOUR_OF_DAY));
-- SUCCESS is already aggregated during the insert
-- These are last week's records
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (1,0,1, 1, 1, 2016042001,'04/20/2016 00:00:00', '04/20/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (1,30,1, 1, 1, 2016042001,'04/20/2016 00:00:00', '04/20/2016 02:00:00', 2);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (2,0,1, 1, 1, 2016042001,'04/20/2016 00:00:00', '04/20/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (2,30,1, 1, 1, 2016042001,'04/20/2016 00:00:00', '04/20/2016 02:00:00', 1);
-- These are yesterday's records
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (3,0,1, 1, 1, 2016042601,'04/26/2016 00:00:00', '04/26/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (3,30,1, 1, 1, 2016042601,'04/26/2016 00:00:00', '04/26/2016 02:00:00', 2);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (4,0,1, 1, 1, 2016042601,'04/26/2016 00:00:00', '04/26/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (4,30,1, 1, 1, 2016042601,'04/26/2016 00:00:00', '04/26/2016 02:00:00', 1);
-- These are today's records
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (5,0,1, 1, 1, 2016042701,'04/27/2016 00:00:00', '04/27/2016 02:00:00', 1);
INSERT INTO TABLE_1 (ORDER_LINE_ID, OFFSET, BREAK_ID, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME, SUCCESS)
VALUES (5,30,1, 1, 1, 2016042701,'04/27/2016 00:00:00', '04/27/2016 02:00:00', 1);
-- Original twice join query
SELECT CURRENT_DAY.BREAK_ID, CURRENT_DAY.ORDER_LINE_ID, CURRENT_DAY.HOUR_OF_DAY, CURRENT_DAY.OFFSET, CURRENT_DAY.ZONE, CURRENT_DAY.NETWORK, CURRENT_DAY.START_TIME, CURRENT_DAY.END_TIME, SUM(CURRENT_DAY.SUCCESS), SUM(YESTERDAY.YESTERDAY_SUCCESS), SUM(LAST_WEEK.LAST_WEEK_SUCCESS)
FROM TABLE_1 CURRENT_DAY
LEFT OUTER JOIN (
SELECT SUM(SUCCESS) YESTERDAY_SUCCESS, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME FROM TABLE_1
GROUP BY ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME
) YESTERDAY
ON YESTERDAY.START_TIME + 1 = CURRENT_DAY.START_TIME
AND YESTERDAY.END_TIME + 1 = CURRENT_DAY.END_TIME
AND YESTERDAY.HOUR_OF_DAY = CURRENT_DAY.HOUR_OF_DAY
AND YESTERDAY.NETWORK = CURRENT_DAY.NETWORK
AND YESTERDAY.ZONE = CURRENT_DAY.ZONE
LEFT OUTER JOIN (
SELECT SUM(SUCCESS) LAST_WEEK_SUCCESS, ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME FROM TABLE_1
GROUP BY ZONE, NETWORK, HOUR_OF_DAY, START_TIME, END_TIME
) LAST_WEEK
-- the second join must reference LAST_WEEK, not YESTERDAY
ON LAST_WEEK.START_TIME + 7 = CURRENT_DAY.START_TIME
AND LAST_WEEK.END_TIME + 7 = CURRENT_DAY.END_TIME
AND LAST_WEEK.HOUR_OF_DAY = CURRENT_DAY.HOUR_OF_DAY
AND LAST_WEEK.NETWORK = CURRENT_DAY.NETWORK
AND LAST_WEEK.ZONE = CURRENT_DAY.ZONE
GROUP BY CURRENT_DAY.BREAK_ID, CURRENT_DAY.ORDER_LINE_ID, CURRENT_DAY.HOUR_OF_DAY, CURRENT_DAY.OFFSET, CURRENT_DAY.ZONE, CURRENT_DAY.NETWORK, CURRENT_DAY.START_TIME, CURRENT_DAY.END_TIME;
-- Using Analytic Query (thank you to MT0)
SELECT BREAK_ID, ORDER_LINE_ID, HOUR_OF_DAY, OFFSET, ZONE, NETWORK, START_TIME, END_TIME, SUM(SUCCESS), SUM(YESTERDAY_SUCCESS), SUM(LAST_WEEK_SUCCESS)
FROM (
SELECT T.*,
SUM( SUCCESS )
OVER ( PARTITION BY ZONE, NETWORK, HOUR_OF_DAY, TO_CHAR(START_TIME, 'HH24:MI:SS'), TO_CHAR(END_TIME, 'HH24:MI:SS')
ORDER BY START_TIME
RANGE BETWEEN INTERVAL '1' DAY PRECEDING AND INTERVAL '1' DAY PRECEDING
) AS YESTERDAY_SUCCESS,
SUM ( SUCCESS )
OVER ( PARTITION BY ZONE, NETWORK, HOUR_OF_DAY, TO_CHAR(START_TIME, 'HH24:MI:SS'), TO_CHAR(END_TIME, 'HH24:MI:SS')
ORDER BY START_TIME
RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND INTERVAL '7' DAY PRECEDING
) AS LAST_WEEK_SUCCESS
FROM TABLE_1 T
) T1
WHERE SYSDATE - INTERVAL '12' HOUR <= START_TIME
AND START_TIME < SYSDATE - INTERVAL '1' HOUR
GROUP BY BREAK_ID, ORDER_LINE_ID, HOUR_OF_DAY, OFFSET, ZONE, NETWORK, START_TIME, END_TIME;
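I have to say thank you for the help in turning this into a question that I hope is understandable. Everything works, but the performance needs some tuning.
On a table with 500K rows it takes 1.8 seconds.
On a table with 20 million rows it takes 400 seconds.
I'd also like to add some of the execution plans that Oracle produced. I'm having trouble tuning the performance.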
-- using twice self join
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | O/1/M |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 50 |00:00:00.84 | 99875 | 217 | 1705 | | | |
| 1 | HASH GROUP BY | | 1 | 6711 | 50 |00:00:00.84 | 99875 | 217 | 1705 | 1616K| 995K| |
|* 2 | FILTER | | 1 | | 119K|00:00:00.65 | 99875 | 0 | 0 | | | |
| 3 | NESTED LOOPS OUTER | | 1 | 54M| 119K|00:00:00.64 | 99875 | 0 | 0 | | | |
|* 4 | HASH JOIN OUTER | | 1 | 109 | 119K|00:00:00.52 | 99875 | 0 | 0 | 13M| 2093K| 1/0/0|
| 5 | TABLE ACCESS BY INDEX ROWID| TABLE_1_IDX | 1 | 109 | 119K|00:00:00.14 | 85908 | 0 | 0 | | | |
|* 6 | INDEX RANGE SCAN | START_TIME_IDX | 1 | 109 | 119K|00:00:00.02 | 320 | 0 | 0 | | | |
| 7 | VIEW | | 1 | 1250 | 29311 |00:00:00.23 | 13967 | 0 | 0 | | | |
| 8 | HASH GROUP BY | | 1 | 1250 | 29311 |00:00:00.22 | 13967 | 0 | 0 | 3008K| 1094K| 1/0/0|
|* 9 | FILTER | | 1 | | 88627 |00:00:00.20 | 13967 | 0 | 0 | | | |
|* 10 | TABLE ACCESS FULL | TABLE_1 | 1 | 1250 | 88627 |00:00:00.19 | 13967 | 0 | 0 | | | |
| 11 | VIEW | | 119K| 499K| 0 |00:00:00.10 | 0 | 0 | 0 | | | |
| 12 | SORT GROUP BY | | 119K| 499K| 0 |00:00:00.08 | 0 | 0 | 0 | 1024 | 1024 | 1/0/0|
|* 13 | FILTER | | 119K| | 0 |00:00:00.02 | 0 | 0 | 0 | | | |
| 14 | TABLE ACCESS FULL | TABLE_1 | 0 | 499K| 0 |00:00:00.01 | 0 | 0 | 0 | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(SYSDATE@!-17<SYSDATE@!-16)
4 - access("YESTERDAY"."ZONE"="T1"."ZONE" AND "YESTERDAY"."NETWORK"="T1"."NETWORK" AND "YESTERDAY"."HOUR_OF_DAY"="T1"."HOUR_OF_DAY"
AND "T1"."END_TIME"=INTERNAL_FUNCTION("YESTERDAY"."END_TIME")+1 AND
"T1"."START_TIME"=INTERNAL_FUNCTION("YESTERDAY"."START_TIME")+1)
6 - access("T1"."START_TIME">=SYSDATE@!-17 AND "T1"."START_TIME"<SYSDATE@!-16)
9 - filter(SYSDATE@!-17<SYSDATE@!-16)
10 - filter((INTERNAL_FUNCTION("START_TIME")+1>=SYSDATE@!-17 AND INTERNAL_FUNCTION("START_TIME")+1<SYSDATE@!-16))
13 - filter(("YESTERDAY"."ZONE"="T1"."ZONE" AND "YESTERDAY"."NETWORK"="T1"."NETWORK" AND "YESTERDAY"."HOUR_OF_DAY"="T1"."HOUR_OF_DAY"
AND "T1"."END_TIME"=INTERNAL_FUNCTION("YESTERDAY"."END_TIME")+7 AND
"T1"."START_TIME"=INTERNAL_FUNCTION("YESTERDAY"."START_TIME")+7))
Another execution plan, this time using the analytic query (thanks again to MT0):
-- using analytic query
-------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | O/1/M |
-------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 50 |00:00:01.51 | 13967 | | | |
| 1 | HASH GROUP BY | | 1 | 499K| 50 |00:00:01.51 | 13967 | 98M| 7788K| |
|* 2 | VIEW | | 1 | 499K| 119K|00:00:01.15 | 13967 | | | |
| 3 | WINDOW SORT | | 1 | 499K| 499K|00:00:01.43 | 13967 | 66M| 2823K| 1/0/0|
|* 4 | FILTER | | 1 | | 499K|00:00:00.16 | 13967 | | | |
| 5 | TABLE ACCESS FULL| TABLE_1 | 1 | 499K| 499K|00:00:00.12 | 13967 | | | |
-------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(("T1"."START_TIME">=SYSDATE@!-INTERVAL'+17 00:00:00' DAY(2) TO SECOND(0) AND
"T1"."START_TIME"<SYSDATE@!-INTERVAL'+16 00:00:00' DAY(2) TO SECOND(0)))
4 - filter(SYSDATE@!-INTERVAL'+17 00:00:00' DAY(2) TO SECOND(0)<SYSDATE@!-INTERVAL'+16 00:00:00' DAY(2) TO
SECOND(0))
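As you can see, I added an index on start_time and the self-join query benefits from it, but the estimated row counts don't match the actual ones. The analytic query simply decides not to use the index at all. Any ideas, pointers, or help would be greatly appreciated. Thanks in advance, everyone.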
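It isn't clear why you join only when today's rows and yesterday's (or last week's) rows match exactly, but if you just want the rows that fall within particular time windows, you can eliminate all of the self-joins and do this: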
SELECT order_line,
zone,
network,
sum(
CASE WHEN SYSDATE - INTERVAL '12' HOUR <= start_time
AND start_time < SYSDATE - INTERVAL '1' HOUR
THEN success
END
) AS total_successes_today,
sum(
CASE WHEN SYSDATE - INTERVAL '12' HOUR <= start_time
AND start_time < SYSDATE - INTERVAL '1' HOUR
THEN error
END
) AS total_errors_today,
sum(
CASE WHEN SYSDATE - INTERVAL '36' HOUR <= start_time
AND start_time < SYSDATE - INTERVAL '25' HOUR
THEN success
END
) AS total_successes_yesterday,
sum(
CASE WHEN SYSDATE - INTERVAL '180' HOUR <= start_time
AND start_time < SYSDATE - INTERVAL '169' HOUR
THEN success
END
) AS total_successes_last_week
FROM table_1
WHERE ( SYSDATE - INTERVAL '12' HOUR <= start_time
AND start_time < SYSDATE - INTERVAL '1' HOUR ) -- today
OR ( SYSDATE - INTERVAL '36' HOUR <= start_time
AND start_time < SYSDATE - INTERVAL '25' HOUR ) -- yesterday = today - 24 hours
OR ( SYSDATE - INTERVAL '180' HOUR <= start_time
AND start_time < SYSDATE - INTERVAL '169' HOUR ) -- last week = today - 7*24 hours
GROUP BY order_line, zone, network
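To check whether a rewrite like the one above actually picks up the index on start_time, and whether the row estimates line up with reality, something along these lines may help. It is only a sketch: it assumes TABLE_1 is owned by the current user, that stale optimizer statistics are a possible (not proven) reason for the E-Rows/A-Rows gap in the question, and it uses a simple COUNT(*) as a stand-in for the real query.

```sql
-- Refresh optimizer statistics for the table and its indexes.
-- Assumption: TABLE_1 lives in the current schema.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'TABLE_1',
    cascade => TRUE   -- also gather statistics on the table's indexes
  );
END;
/

-- Run the statement under test with runtime row counts collected.
-- The COUNT(*) here is a placeholder for the real query.
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM TABLE_1
WHERE START_TIME >= SYSDATE - INTERVAL '12' HOUR
  AND START_TIME <  SYSDATE - INTERVAL '1' HOUR;

-- Show the plan of the last statement executed in this session,
-- including the Starts / E-Rows / A-Rows columns.
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

The `/` line is the SQL*Plus/SQLcl terminator for the PL/SQL block; other clients may not need it.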
However, if you do want to keep the join on start time and end time, you can use an analytic query:
SELECT order_line,
zone,
network,
SUM( success ),
SUM( error ),
SUM( yesterday_success ),
SUM( last_week_success )
FROM (
SELECT t.*,
SUM( success )
OVER ( PARTITION BY id,
TO_CHAR( start_time, 'HH24:MI:SS' ),
TO_CHAR( end_time, 'HH24:MI:SS' )
ORDER BY start_time
RANGE BETWEEN INTERVAL '1' DAY PRECEDING AND INTERVAL '1' DAY PRECEDING
) AS yesterday_success,
SUM( success )
OVER ( PARTITION BY id,
TO_CHAR( start_time, 'HH24:MI:SS' ),
TO_CHAR( end_time, 'HH24:MI:SS' )
ORDER BY start_time
RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND INTERVAL '7' DAY PRECEDING
) AS last_week_success
FROM TABLE_1 t
)
WHERE SYSDATE - INTERVAL '12' HOUR <= start_time
AND start_time < SYSDATE - INTERVAL '1' HOUR
GROUP BY
order_line,
zone,
network
ORDER BY
order_line,
zone,
network
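One thing that stands out in the analytic plan from the question is that the WINDOW SORT processes all 499K rows and the start_time filter is only applied afterwards, in the VIEW step. Because the window functions only ever look back 1 and 7 days, the inline view could be restricted to that range before the sort. The following is a sketch under that assumption, written against the TABLE_1 columns from the question rather than the column names used in this answer; whether the optimizer then chooses the start_time index still depends on how selective the range is.

```sql
SELECT BREAK_ID, ORDER_LINE_ID, HOUR_OF_DAY, OFFSET, ZONE, NETWORK,
       START_TIME, END_TIME,
       SUM(SUCCESS), SUM(YESTERDAY_SUCCESS), SUM(LAST_WEEK_SUCCESS)
FROM (
  SELECT T.*,
         SUM(SUCCESS) OVER (
           PARTITION BY ZONE, NETWORK, HOUR_OF_DAY,
                        TO_CHAR(START_TIME, 'HH24:MI:SS'),
                        TO_CHAR(END_TIME, 'HH24:MI:SS')
           ORDER BY START_TIME
           RANGE BETWEEN INTERVAL '1' DAY PRECEDING
                     AND INTERVAL '1' DAY PRECEDING
         ) AS YESTERDAY_SUCCESS,
         SUM(SUCCESS) OVER (
           PARTITION BY ZONE, NETWORK, HOUR_OF_DAY,
                        TO_CHAR(START_TIME, 'HH24:MI:SS'),
                        TO_CHAR(END_TIME, 'HH24:MI:SS')
           ORDER BY START_TIME
           RANGE BETWEEN INTERVAL '7' DAY PRECEDING
                     AND INTERVAL '7' DAY PRECEDING
         ) AS LAST_WEEK_SUCCESS
  FROM TABLE_1 T
  -- Pre-filter: keep only rows the outer window can reach, i.e. from
  -- 7 days before the start of the reporting range up to its end.
  WHERE START_TIME >= SYSDATE - INTERVAL '7' DAY - INTERVAL '12' HOUR
    AND START_TIME <  SYSDATE - INTERVAL '1' HOUR
) T1
-- Reporting range: unchanged from the original query.
WHERE SYSDATE - INTERVAL '12' HOUR <= START_TIME
  AND START_TIME < SYSDATE - INTERVAL '1' HOUR
GROUP BY BREAK_ID, ORDER_LINE_ID, HOUR_OF_DAY, OFFSET, ZONE, NETWORK,
         START_TIME, END_TIME;
```

The inner filter does not change the result, since every row the 1-day and 7-day windows can reach lies inside it; it only shrinks the set of rows that has to be sorted.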
You could see whether function-based indexes on TO_CHAR( start_time, 'HH24:MI:SS' ) and TO_CHAR( end_time, 'HH24:MI:SS' ) speed it up.
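As a rough illustration of that suggestion, such an index might look like the following. The index name is made up, the column list assumes the TABLE_1 definition from the question, and there is no guarantee that the optimizer will use it to avoid the window sort; it would need to be checked against the actual execution plan.

```sql
-- Hypothetical index name; columns follow the TABLE_1 definition in the question.
-- The leading columns mirror the PARTITION BY list of the window functions,
-- with START_TIME last to match the window's ORDER BY.
CREATE INDEX TABLE_1_WINDOW_IDX ON TABLE_1 (
  ZONE,
  NETWORK,
  HOUR_OF_DAY,
  TO_CHAR(START_TIME, 'HH24:MI:SS'),
  TO_CHAR(END_TIME, 'HH24:MI:SS'),
  START_TIME
);
```

An extra index also adds maintenance cost on every insert, so it is worth measuring on the 20-million-row table before keeping it.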