
Optimize Oracle SQL Select with multiple MAX subqueries

In an Oracle 12c database I have the following query:

SELECT pd.product_id,
       sd.column_1,
       sd.column_2,
(SELECT ROUND ( SUM (cost_1 + cost_2) / MAX (ref_coefficient), 4)
   FROM cost_reference
  WHERE product_name = sd.product_name)
     AS weighted_cost,
(SELECT cost_1 + cost_2
   FROM cost_reference
  WHERE product_name = sd.product_name 
    AND ref_coefficient = (SELECT MAX (ref_coefficient)
                              FROM cost_reference
                             WHERE product_name = sd.product_name ))
     AS normal_cost
   FROM sales_data sd, product_data pd
  WHERE sd.product_name = pd.product_name 
    AND sd.date = SYSDATE

The cost_reference table is used for pulling a certain coefficient based on what is loaded in it at the current time. The weighted_cost and normal_cost columns are calculated by the subqueries shown. However, the multiple calls to MAX(ref_coefficient) really slow this query down. The whole thing takes about 3 seconds, but we would like to cut that time down, since the query is called from a place where the data is refreshed constantly and the end result has only about 50 rows. cost_reference usually contains about 2500 rows.

The proper indexes are there as well, because when they weren't it took about 10 seconds. If I take out these calculated columns, the query is instant. I have tried either joining the cost_reference table or using a WITH statement for it, but I am not having any luck. Is there any way to further optimize this query?

Interesting note: As we have Dell Toad for our SQL IDE, I ran its built-in optimization tool on the query and noticed that one of the suggested rewrites cut the time in half. However, upon inspecting it, all it did was put this optimizer hint comment in the normal_cost calculation. If there aren't any optimization suggestions, does anyone know why this worked? When I remove it, the query goes back to its slower behavior.

SELECT /*+ FULL(COST_REFERENCE) */
      cost_1 + cost_2
 FROM cost_reference

Write the query using proper JOIN syntax. I also recommend qualifying all column names, particularly when using correlated subqueries.

You can rewrite the second subquery so it uses aggregation. And because SYSDATE has a time component, sd.date = SYSDATE only matches rows stamped with the exact current second, so it is highly unlikely that your WHERE clause does what you expect.
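To illustrate the SYSDATE point, here is a small sketch that runs against inline sample rows rather than the real tables. The sample_sales name and its sale_date column are made up for the example; sale_date stands in for sd.date from the question.

```sql
-- Hypothetical sample rows: one stored at midnight today, one at 09:00 today.
WITH sample_sales (product_name, sale_date) AS (
  SELECT 'A', TRUNC(SYSDATE)        FROM dual UNION ALL
  SELECT 'B', TRUNC(SYSDATE) + 9/24 FROM dual
)
SELECT product_name,
       -- Matches only if the row was stamped with the exact current second.
       CASE WHEN sale_date = SYSDATE THEN 'Y' ELSE 'N' END AS eq_sysdate,
       -- Matches rows stored as a pure date (midnight).
       CASE WHEN sale_date = TRUNC(SYSDATE) THEN 'Y' ELSE 'N' END AS eq_trunc,
       -- Half-open range: matches any time of day today.
       CASE WHEN sale_date >= TRUNC(SYSDATE)
             AND sale_date <  TRUNC(SYSDATE) + 1 THEN 'Y' ELSE 'N' END AS in_today
FROM sample_sales;
```

If the real column is always stored without a time of day, comparing against TRUNC(SYSDATE) is enough. If it can carry a time, the half-open range is the safer predicate, and it can still use an index on the column.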

So, I would rewrite the query as:

SELECT pd.product_id, sd.column_1, sd.column_2,
       (SELECT ROUND(SUM(cr.cost_1 + cr.cost_2) / MAX(cr.ref_coefficient), 4)
        FROM cost_reference cr
        WHERE cr.product_name = sd.product_name
       ) AS weighted_cost,
       (SELECT MAX(cr.cost_1 + cr.cost_2) KEEP (DENSE_RANK FIRST ORDER BY cr.ref_coefficient DESC)
        FROM cost_reference cr
        WHERE cr.product_name = sd.product_name
       ) AS normal_cost
FROM sales_data sd JOIN
     product_data pd
     ON sd.product_name = pd.product_name
WHERE sd.date = TRUNC(SYSDATE)

For this query, you want the following indexes:

  • sales_data(date, product_name)
  • product_data(product_name)
  • cost_reference(product_name, ref_coefficient, cost_1, cost_2)
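As a sketch, the indexes listed above could be created as follows. The index names are invented for the example, and date is the placeholder column name from the question (DATE is a reserved word in Oracle, so the real column presumably has a different name that you would substitute).

```sql
-- Index names are hypothetical; column lists follow the bullets above.
-- Replace "date" with the real name of the date column in sales_data.
CREATE INDEX sales_data_date_prod_ix
    ON sales_data (date, product_name);

CREATE INDEX product_data_prod_ix
    ON product_data (product_name);

-- Covering index: both subqueries can be answered from the index alone.
CREATE INDEX cost_ref_cover_ix
    ON cost_reference (product_name, ref_coefficient, cost_1, cost_2);
```

Skip any index that already exists with the same leading columns. The question says the "proper indexes" are in place, so some of these may be redundant.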

Since you say that cost_reference has only about 2,500 rows, you can try this:

SELECT pd.product_id, sd.column_1, sd.column_2,
       cr.weighted_cost,
       cr.normal_cost
FROM sales_data sd JOIN
     product_data pd
     ON sd.product_name = pd.product_name
     -- Inner join: sales rows with no cost_reference match are dropped
     -- (the scalar-subquery version returns NULL costs for them instead).
     JOIN (SELECT c.product_name,
                  ROUND(SUM(c.cost_1 + c.cost_2) / MAX(c.ref_coefficient), 4) AS weighted_cost,
                  MAX(c.cost_1 + c.cost_2) KEEP (DENSE_RANK FIRST ORDER BY c.ref_coefficient DESC) AS normal_cost
           FROM cost_reference c
           GROUP BY c.product_name
          ) cr
     ON cr.product_name = sd.product_name
WHERE sd.date = TRUNC(SYSDATE)

The idea is to aggregate cost_reference just once; because the table is small, the aggregated result set is small too. NB: I have not fully tested this SQL against these tables.
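To see what that single aggregation produces, here is a minimal sketch that runs it over made-up sample rows instead of the real table. The inline values are purely illustrative.

```sql
-- Hypothetical sample data standing in for cost_reference.
WITH cost_reference (product_name, cost_1, cost_2, ref_coefficient) AS (
  SELECT 'A', 10, 5, 2 FROM dual UNION ALL
  SELECT 'A', 20, 1, 4 FROM dual UNION ALL
  SELECT 'B',  3, 3, 1 FROM dual
)
SELECT product_name,
       -- Sum of all costs divided by the highest coefficient for the product.
       ROUND(SUM(cost_1 + cost_2) / MAX(ref_coefficient), 4) AS weighted_cost,
       -- Cost taken from the row(s) with the highest coefficient.
       MAX(cost_1 + cost_2)
         KEEP (DENSE_RANK FIRST ORDER BY ref_coefficient DESC) AS normal_cost
FROM cost_reference
GROUP BY product_name
ORDER BY product_name;
```

With these sample rows, product A gets weighted_cost 9 (36 / 4) and normal_cost 21, and product B gets 6 and 6. If several rows tie on the highest ref_coefficient, the KEEP form returns the largest cost among them, whereas the original scalar subquery would raise a "single-row subquery returns more than one row" error.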


 