I have a procedure that fills a table, containing the following SQL:
SELECT NVL(SUM(COL1), 0),
NVL(SUM(COL2), 0)
INTO v_mytable.COLUMN1,
v_mytable.COLUMN2
FROM t1, t2
WHERE t1.id = t2.id
AND t1.date = t2.date;
For 99% of the table rows those columns are 0, and this query takes a long time to execute even though it will return 0 for both columns in most cases.
Is it better to use exception handling, as follows:
BEGIN
SELECT SUM(COL1),
SUM(COL2)
INTO v_mytable.COLUMN1,
v_mytable.COLUMN2
FROM t1, t2
WHERE t1.id = t2.id
AND t1.date = t2.date;
EXCEPTION WHEN NO_DATA_FOUND THEN
v_mytable.COLUMN1 := 0 ;
v_mytable.COLUMN2 := 0 ;
END;
Thanks.
Those two blocks do completely different things. Your SELECT statement would not throw a NO_DATA_FOUND error if COL1 and/or COL2 were always NULL. It would simply put a NULL in v_mytable.COLUMN1 and v_mytable.COLUMN2.
You could do
SELECT SUM(COL1),
SUM(COL2)
INTO v_mytable.COLUMN1,
v_mytable.COLUMN2
FROM t1, t2
WHERE t1.id = t2.id
AND t1.date = t2.date;
v_mytable.COLUMN1 := NVL( v_mytable.COLUMN1, 0 );
v_mytable.COLUMN2 := NVL( v_mytable.COLUMN2, 0 );
I wouldn't expect that to be any faster, however.
Given the choice between these two, I'd go for the first one.
I prefer to use exception handlers for genuine exceptions / errors, not control flow.
YMMV.
NO_DATA_FOUND would be thrown if no rows were returned, NOT if null values were returned in the actual rows that ARE returned from the query. This would throw NO_DATA_FOUND:
select sysdate
into myVariable
from dual
where 1=0;
This would NOT throw NO_DATA_FOUND:
select null
into myVariable
from dual;
That said, if you simply want to IGNORE the rows where col1 and col2 are null, you might consider using collections in PL/SQL with BULK COLLECT INTO, something like:
select sum(col1) as sum_col1, sum(col2) as sum_col2, col3
bulk collect into v_mytable
FROM t1, t2
WHERE t1.id = t2.id
AND t1.date = t2.date
AND col1 is not null
AND col2 is not null
GROUP by col3;
No looping; it is all done in one fell swoop. FYI, you would set up v_mytable something like:
declare
type t_rec is record
(col1_sum number,
col2_sum number,
col3 number);
v_rec t_rec;
type t_tab is table of t_rec;
v_mytable t_tab;
begin
...
Later you can loop through v_mytable, which will contain only 1% of the original t1,t2 join result (due to the additional NOT NULL predicates in the query).
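For completeness, iterating the filled collection afterwards is just an indexed loop; a minimal sketch (assuming the declarations above, and DBMS_OUTPUT enabled for display):

FOR i IN 1 .. v_mytable.COUNT LOOP
  -- each element holds the per-col3 sums from the bulk collect
  dbms_output.put_line(v_mytable(i).col3
                       || ': ' || v_mytable(i).col1_sum
                       || ' / ' || v_mytable(i).col2_sum);
END LOOP;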
Hope that helps.
Your SQL will run a lot faster if you stop joining the rows whose col values are 0. Below is a small test to prove the point.
First create two tables with 100,000 rows, where 99% of the rows have their col value set to 0:
SQL> create table t1 (id,date1,col1)
2 as
3 select level
4 , trunc(sysdate)
5 , case mod(level,100) when 42 then 42 else 0 end
6 from dual
7 connect by level <= 100000
8 /
Table created.
SQL> create table t2 (id,date2,col2)
2 as
3 select level
4 , trunc(sysdate)
5 , case mod(level,100) when 42 then 84 else 0 end
6 from dual
7 connect by level <= 100000
8 /
Table created.
Give the cost based optimizer table statistics:
SQL> exec dbms_stats.gather_table_stats(user,'t1')
PL/SQL procedure successfully completed.
SQL> exec dbms_stats.gather_table_stats(user,'t2')
PL/SQL procedure successfully completed.
And gather statistics when running queries:
SQL> set serveroutput off
SQL> alter session set statistics_level = all
2 /
Session altered.
Now your query runs like this:
SQL> SELECT NVL(SUM(t1.COL1), 0)
2 , NVL(SUM(t2.COL2), 0)
3 FROM t1
4 , t2
5 WHERE t1.id = t2.id
6 AND t1.date1 = t2.date2
7 /
NVL(SUM(T1.COL1),0) NVL(SUM(T2.COL2),0)
------------------- -------------------
42000 84000
1 row selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------
SQL_ID 6q5h7h8ht5232, child number 0
-------------------------------------
SELECT NVL(SUM(t1.COL1), 0) , NVL(SUM(t2.COL2), 0) FROM t1 , t2 WHERE t1.id = t2.id AND
t1.date1 = t2.date2
Plan hash value: 446739472
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.37 | 560 | | | |
|* 2 | HASH JOIN | | 1 | 100K| 100K|00:00:00.24 | 560 | 4669K| 1437K| 7612K (0)|
| 3 | TABLE ACCESS FULL| T1 | 1 | 100K| 100K|00:00:00.01 | 280 | | | |
| 4 | TABLE ACCESS FULL| T2 | 1 | 100K| 100K|00:00:00.01 | 280 | | | |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."ID"="T2"."ID" AND "T1"."DATE1"="T2"."DATE2")
21 rows selected.
You can see that the HASH JOIN needs to join 100K rows, and this is where most of the time is spent. Now exclude the 0 values:
SQL> SELECT NVL(SUM(t1.COL1), 0)
2 , NVL(SUM(t2.COL2), 0)
3 FROM t1
4 , t2
5 WHERE t1.id = t2.id
6 AND t1.date1 = t2.date2
7 and t1.col1 != 0
8 and t2.col2 != 0
9 /
NVL(SUM(T1.COL1),0) NVL(SUM(T2.COL2),0)
------------------- -------------------
42000 84000
1 row selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------
SQL_ID bjr7wrjx5tjvr, child number 0
-------------------------------------
SELECT NVL(SUM(t1.COL1), 0) , NVL(SUM(t2.COL2), 0) FROM t1 , t2 WHERE t1.id = t2.id AND
t1.date1 = t2.date2 and t1.col1 != 0 and t2.col2 != 0
Plan hash value: 446739472
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.02 | 560 | | | |
|* 2 | HASH JOIN | | 1 | 25000 | 1000 |00:00:00.02 | 560 | 1063K| 1063K| 1466K (0)|
|* 3 | TABLE ACCESS FULL| T1 | 1 | 50000 | 1000 |00:00:00.01 | 280 | | | |
|* 4 | TABLE ACCESS FULL| T2 | 1 | 50000 | 1000 |00:00:00.01 | 280 | | | |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T1"."ID"="T2"."ID" AND "T1"."DATE1"="T2"."DATE2")
3 - filter("T1"."COL1"<>0)
4 - filter("T2"."COL2"<>0)
23 rows selected.
And you can see that the HASH JOIN now only needs to join 1000 rows, leading to a much faster output.
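Applied to your original procedure, the query would look something like the sketch below. Note that the extra predicates assume, as in the test data above, that COL1 and COL2 are zero on the same rows; if they can be zero independently, use OR instead of AND so that no non-zero value is filtered out of either sum. The NVL calls still cover the case where the filter leaves no rows at all:

SELECT NVL(SUM(t1.COL1), 0),
       NVL(SUM(t2.COL2), 0)
  INTO v_mytable.COLUMN1,
       v_mytable.COLUMN2
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.date = t2.date
   AND t1.COL1 != 0
   AND t2.COL2 != 0;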
Hope this helps.
Regards,
Rob.