简体   繁体   中英

NVL or EXCEPTION NO_DATA_FOUND

i have in a procedure which fills a table the following sql

SELECT NVL(SUM(COL1), 0),
       NVL(SUM(COL2), 0)
  INTO v_mytable.COLUMN1,
       v_mytable.COLUMN2
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.date = t2.date

also, for 99% of the table rows, those columns = 0 and this query take a long time to be executed when it will return 0 for both columns in most cases.

Is it better to use exception handeling as the following :

BEGIN
  SELECT SUM(COL1),
         SUM(COL2)
    INTO v_mytable.COLUMN1,
         v_mytable.COLUMN2
    FROM t1, t2
   WHERE t1.id = t2.id
     AND t1.date = t2.date
EXCEPTION WHEN NO_DATA_FOUND THEN
  v_mytable.COLUMN1 := 0 ;
  v_mytable.COLUMN2 := 0 ;
END;

Thanks.

Those two blocks do completely different things. Your SELECT statement would not throw a NO_DATA_FOUND error if COL1 and/or COL2 were always NULL. It would simply put a NULL in v_mytable.COLUMN1 and v_mytable.COLUMN2 .

You could do

SELECT SUM(COL1),
       SUM(COL2)
  INTO v_mytable.COLUMN1,
       v_mytable.COLUMN2
  FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.date = t2.date

v_mytable.COLUMN1 := NVL( v_mytable.COLUMN1, 0 );
v_mytable.COLUMN2 := NVL( v_mytable.COLUMN2, 0 );

I wouldn't expect that to be any faster, however.

Given the choice between these two, I'd go for the first one.

I prefer to use exception handlers for genuine exceptions / errors, not control flow.

YMMV.

NO_DATA_FOUND would be thrown if no rows were returned, NOT if null values were returned in the actual rows that ARE returned from the query. This would throw NO_DATA_FOUND:

select sysdate
into myVariable
from dual
where 1=0;

This would NOT throw NO_DATA_FOUND:

select null
into myVariable
from dual;

That said, if you simply want to IGNORE the rows where col1 and col2 are null, then you may consider using collections in pl/sql, and use bulk collect into, something like:

select sum(col1) as sum_col1, sum(col2) as sum_col2, col3
bulk collect into v_mytable
FROM t1, t2
 WHERE t1.id = t2.id
   AND t1.date = t2.date
   AND col1 is not null
   AND col2 is not null
GROUP by col3;

No looping, do in one fell swoop. FYI, you would setup v_mytable something like:

declare
  type t_rec is record
  (col1_sum number,
   col2_sum number,
   col3 number);
  v_rec t_rec;

  type t_tab is table of v_rec%type;
  v_mytable t_tab;

begin
...

Later you can loop through v_mytable, which will be only 1% of the original t1,t2 join result (due to the additional not null clauses in query).

Hope that helps.

Your SQL will run a lot faster if you stop joining the rows for which the col values are 0. Below is a small test to prove my point.

First create two tables with 100,000 rows, where 99% of the rows have their col value set to 0:

SQL> create table t1 (id,date1,col1)
  2  as
  3   select level
  4        , trunc(sysdate)
  5        , case mod(level,100) when 42 then 42 else 0 end
  6     from dual
  7  connect by level <= 100000
  8  /

Table created.

SQL> create table t2 (id,date2,col2)
  2  as
  3   select level
  4        , trunc(sysdate)
  5        , case mod(level,100) when 42 then 84 else 0 end
  6     from dual
  7  connect by level <= 100000
  8  /

Table created.

Give the cost based optimizer table statistics:

SQL> exec dbms_stats.gather_table_stats(user,'t1')

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'t2')

PL/SQL procedure successfully completed.

And gather statistics when running queries:

SQL> set serveroutput off
SQL> alter session set statistics_level = all
  2  /

Session altered.

Now your query runs like this:

SQL> SELECT NVL(SUM(t1.COL1), 0)
  2       , NVL(SUM(t2.COL2), 0)
  3    FROM t1
  4       , t2
  5   WHERE t1.id = t2.id
  6     AND t1.date1 = t2.date2
  7  /

NVL(SUM(T1.COL1),0) NVL(SUM(T2.COL2),0)
------------------- -------------------
              42000               84000

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------
SQL_ID  6q5h7h8ht5232, child number 0
-------------------------------------
SELECT NVL(SUM(t1.COL1), 0)      , NVL(SUM(t2.COL2), 0)   FROM t1      , t2  WHERE t1.id = t2.id    AND
t1.date1 = t2.date2

Plan hash value: 446739472

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.37 |     560 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |    100K|    100K|00:00:00.24 |     560 |  4669K|  1437K| 7612K (0)|
|   3 |    TABLE ACCESS FULL| T1   |      1 |    100K|    100K|00:00:00.01 |     280 |       |       |          |
|   4 |    TABLE ACCESS FULL| T2   |      1 |    100K|    100K|00:00:00.01 |     280 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."ID"="T2"."ID" AND "T1"."DATE1"="T2"."DATE2")


21 rows selected.

You can see that the HASH JOIN needs to join 100K rows, and this is where most of the time is spent. Now exclude the 0 values:

SQL> SELECT NVL(SUM(t1.COL1), 0)
  2       , NVL(SUM(t2.COL2), 0)
  3    FROM t1
  4       , t2
  5   WHERE t1.id = t2.id
  6     AND t1.date1 = t2.date2
  7     and t1.col1 != 0
  8     and t2.col2 != 0
  9  /

NVL(SUM(T1.COL1),0) NVL(SUM(T2.COL2),0)
------------------- -------------------
              42000               84000

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
  2  /

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------
SQL_ID  bjr7wrjx5tjvr, child number 0
-------------------------------------
SELECT NVL(SUM(t1.COL1), 0)      , NVL(SUM(t2.COL2), 0)   FROM t1      , t2  WHERE t1.id = t2.id    AND
t1.date1 = t2.date2    and t1.col1 != 0    and t2.col2 != 0

Plan hash value: 446739472

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE     |      |      1 |      1 |      1 |00:00:00.02 |     560 |       |       |          |
|*  2 |   HASH JOIN         |      |      1 |  25000 |   1000 |00:00:00.02 |     560 |  1063K|  1063K| 1466K (0)|
|*  3 |    TABLE ACCESS FULL| T1   |      1 |  50000 |   1000 |00:00:00.01 |     280 |       |       |          |
|*  4 |    TABLE ACCESS FULL| T2   |      1 |  50000 |   1000 |00:00:00.01 |     280 |       |       |          |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."ID"="T2"."ID" AND "T1"."DATE1"="T2"."DATE2")
   3 - filter("T1"."COL1"<>0)
   4 - filter("T2"."COL2"<>0)


23 rows selected.

And you can see that the HASH JOIN now only needs to join 1000 rows, leading to a much faster output.

Hope this helps.

Regards,
Rob.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM