
SQL : Oracle : Optimize LEFT Outer Join Query

I have one local table (all of its column names differ from those of the remote tables, except one) and two remote tables (which share the same column names), and I need to combine their data.

Following is the query I have written using LEFT OUTER JOIN and UNION but the performance is slow.

Could anyone please help optimize this query?

select
"CONTROL_M_SERVER",
"HOST",
CASE
WHEN "AGSTAT" = 'V' THEN 'Available'
WHEN "AGSTAT" = 'U' THEN 'Unavailable'
WHEN "AGSTAT" = 'R' THEN 'Discovering'
ELSE 'Not Defined in Control-M'
END as Agent_Status,
T1.VERSION,
"PORTS",
"MANAGEMENT_IP",
"OPERATING_SYSTEM",
"CLUSTER_ALIAS",
"NODEGROUP",
"APPLICATION_ID",
"DATE_CONFIGURED",
"CONFIGURED_BY"
from "CTMAGENTAUDIT" T1
left outer join (
    select NODEID, AGSTAT from CMR_NODES@SPDB
    UNION ALL
    select NODEID, AGSTAT from CMR_NODES@DEVDB
) T2 on T2.NODEID = T1.HOST;

The major issue I see with your query is the left join between CTMAGENTAUDIT and the subquery that contains the union. The result of that UNION ALL subquery has no index of its own, so, as written, Oracle is unlikely to be able to use an index on NODEID for the join. This means that Oracle will probably have to resort to a slower join method, possibly a full scan of both remote tables.
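To check which join method Oracle actually chooses before changing anything, the plan can be inspected. The following is a minimal sketch that uses a cut-down select list (an assumption made for brevity; the join is the same as in the question) and assumes a plan table is available to your user:

```sql
-- Store the execution plan for the join from the question
-- (select list shortened for illustration only).
EXPLAIN PLAN FOR
SELECT T1.HOST, T2.AGSTAT
FROM CTMAGENTAUDIT T1
LEFT OUTER JOIN (
    SELECT NODEID, AGSTAT FROM CMR_NODES@SPDB
    UNION ALL
    SELECT NODEID, AGSTAT FROM CMR_NODES@DEVDB
) T2 ON T2.NODEID = T1.HOST;

-- Display the plan that was just stored.
-- Steps with a REMOTE operation are executed over a database link.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan shows both remote tables being fetched in full and then joined locally, that would be consistent with the slowness described above.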

One approach here would be to create a materialized view containing the union query, and then index it:

CREATE MATERIALIZED VIEW T2 AS
SELECT NODEID, AGSTAT FROM CMR_NODES@SPDB
UNION ALL
SELECT NODEID, AGSTAT FROM CMR_NODES@DEVDB;

CREATE INDEX mv_node_idx ON T2 (NODEID);

With this indexed materialized view in place, I would expect your query to perform much better now:

SELECT
    CONTROL_M_SERVER,
    HOST,
    CASE WHEN AGSTAT = 'V' THEN 'Available'
         WHEN AGSTAT = 'U' THEN 'Unavailable'
         WHEN AGSTAT = 'R' THEN 'Discovering'
         ELSE 'Not Defined in Control-M' END AS Agent_Status,
    T1.VERSION,
    PORTS,
    MANAGEMENT_IP,
    OPERATING_SYSTEM,
    CLUSTER_ALIAS,
    NODEGROUP,
    APPLICATION_ID,
    DATE_CONFIGURED,
    CONFIGURED_BY
FROM CTMAGENTAUDIT T1
LEFT OUTER JOIN T2
    ON T2.NODEID = T1.HOST;

I'd do something like this:

select
"CONTROL_M_SERVER",
"HOST",
-- Each scalar subquery must return one column and at most one row per host,
-- so this assumes NODEID is unique in each remote table.
CASE coalesce(
       (select t2.AGSTAT from CMR_NODES@SPDB t2 where t2.NODEID = T1.HOST),
       (select t3.AGSTAT from CMR_NODES@DEVDB t3 where t3.NODEID = T1.HOST))
WHEN 'V' THEN 'Available'
WHEN 'U' THEN 'Unavailable'
WHEN 'R' THEN 'Discovering'
ELSE 'Not Defined in Control-M'
END as Agent_Status,
T1.VERSION,
"PORTS",
"MANAGEMENT_IP",
"OPERATING_SYSTEM",
"CLUSTER_ALIAS",
"NODEGROUP",
"APPLICATION_ID",
"DATE_CONFIGURED",
"CONFIGURED_BY"
from "CTMAGENTAUDIT" T1;

Your query is basically:

select . . .
from CTMAGENTAUDIT T1 left outer join
     (select NODEID, AGSTAT
      from CMR_NODES@SPDB UNION ALL
      select NODEID, AGSTAT
      from CMR_NODES@DEVDB
     ) T2
     on T2.NODEID = T1.HOST;

Assuming each NODEID appears at most once in each of the CMR_NODES tables, I would write this as:

select . . .,
       coalesce(s1.AGSTAT, s2.AGSTAT) as AGSTAT,
       (case coalesce(s1.AGSTAT, s2.AGSTAT) 
            when 'V' then 'Available'
            when 'U' then 'Unavailable'
            when 'R' then 'Discovering'
            else 'Not Defined in Control-M'
        end) as Agent_Status
from CTMAGENTAUDIT T1 left outer join
     CMR_NODES@SPDB s1
     on s1.NODEID = T1.HOST left outer join
     CMR_NODES@DEVDB s2
     on s2.NODEID = T1.HOST;

This will at least allow each table to be optimized separately -- which should help.
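The uniqueness assumption is worth verifying before relying on this rewrite. As a rough sketch, the following queries list any NODEID that occurs more than once in either remote table; if both return no rows, the assumption holds:

```sql
-- NODEIDs that appear more than once in the SPDB copy.
SELECT NODEID, COUNT(*) AS row_count
FROM CMR_NODES@SPDB
GROUP BY NODEID
HAVING COUNT(*) > 1;

-- NODEIDs that appear more than once in the DEVDB copy.
SELECT NODEID, COUNT(*) AS row_count
FROM CMR_NODES@DEVDB
GROUP BY NODEID
HAVING COUNT(*) > 1;
```

Note one behavioral difference: if the same host exists in both remote databases, the original UNION ALL join returns two rows for it, while the coalesce version returns a single row and prefers the SPDB status.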

Obviously, the solution with the materialized view will be faster, if you have the permissions and desire to set up cross-server materialized views. There are additional maintenance issues with materialized views, particularly if you have multiple such views and assume that they are updated at the same time.
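One of those maintenance issues is keeping the materialized view current, since queries against it only see data as of the last refresh. A minimal sketch of two options follows, assuming the view is named T2 as above; the hourly interval is an arbitrary choice for illustration:

```sql
-- Option 1: complete refresh on demand (method 'C' = complete).
BEGIN
    DBMS_MVIEW.REFRESH(list => 'T2', method => 'C');
END;
/

-- Option 2: have Oracle refresh the view on a schedule.
-- SYSDATE + 1/24 means one hour after each refresh (assumed interval).
ALTER MATERIALIZED VIEW T2
    REFRESH COMPLETE
    START WITH SYSDATE
    NEXT SYSDATE + 1/24;
```

Which option fits depends on how stale the agent status is allowed to be and how expensive it is to pull both remote tables.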


 