In an Oracle 12c database I have the following query:
SELECT pd.product_id,
       sd.column_1,
       sd.column_2,
       (SELECT ROUND (SUM (cost_1 + cost_2) / MAX (ref_coefficient), 4)
          FROM cost_reference
         WHERE product_name = sd.product_name)
          AS weighted_cost,
       (SELECT cost_1 + cost_2
          FROM cost_reference
         WHERE product_name = sd.product_name
           AND ref_coefficient = (SELECT MAX (ref_coefficient)
                                    FROM cost_reference
                                   WHERE product_name = sd.product_name))
          AS normal_cost
  FROM sales_data sd, product_data pd
 WHERE sd.product_name = pd.product_name
   AND sd.date = SYSDATE
The cost_reference table is used to look up a coefficient based on whatever is loaded in it at the current time. The weighted_cost and normal_cost columns are calculated by the subqueries shown. However, the repeated MAX(ref_coefficient) lookups really slow this query down. The whole thing takes about 3 seconds, but we would like to cut that down, since the query is called from a place where the data is refreshed constantly and the end result only has about 50 rows. cost_reference usually contains about 2500 rows.
The proper indexes are there as well, because when they weren't it took about 10 seconds. If I take out these calculated columns, the query is instant. I have tried either joining the cost_reference table or using a WITH statement for it, but I am not having any luck. Is there any way to further optimize this query?
Interesting note: Since we use Dell Toad as our SQL IDE, I ran its built-in optimization tool on the query and noticed that one of the suggested rewrites cut the time in half. However, upon inspecting it, all it did was put the hint comment below in the normal_cost calculation. If there aren't any optimization suggestions, does anyone know why this worked? When I remove it, the query goes back to its slower behavior.
SELECT /*+ FULL(COST_REFERENCE) */
cost_1 + cost_2
FROM cost_reference
Write the query using proper JOIN syntax. I also recommend qualifying all column names, particularly when using correlated subqueries.
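You can rewrite the second subquery so that it uses aggregation. Also, because SYSDATE has a time component (it is accurate to the second), it is highly unlikely that your WHERE clause does what you expect: sd.date = SYSDATE only matches rows stamped with the exact current second.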
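To make the SYSDATE point concrete, here is a minimal sketch. It assumes the date column may carry a time-of-day part; `sale_date` is a hypothetical stand-in for the question's date column (DATE is a reserved word in Oracle, so the real column presumably has another name). If the column is always stored at midnight, a plain equality against TRUNC(SYSDATE) is enough.

```sql
-- SYSDATE carries the current time of day; TRUNC(SYSDATE) is today at 00:00:00.
SELECT TO_CHAR(SYSDATE, 'YYYY-MM-DD HH24:MI:SS')        AS now_with_time,
       TO_CHAR(TRUNC(SYSDATE), 'YYYY-MM-DD HH24:MI:SS') AS today_midnight
  FROM dual;

-- Half-open range: matches every row dated today, whatever its time part.
-- sale_date is a hypothetical column name, not taken from the question.
SELECT sd.product_name, sd.column_1, sd.column_2
  FROM sales_data sd
 WHERE sd.sale_date >= TRUNC(SYSDATE)
   AND sd.sale_date <  TRUNC(SYSDATE) + 1;
```

The range form leaves the column unwrapped, so an ordinary index on it can still be used.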
So, I would rewrite the query as:
SELECT pd.product_id, sd.column_1, sd.column_2,
       (SELECT ROUND(SUM(cr.cost_1 + cr.cost_2) / MAX(cr.ref_coefficient), 4)
        FROM cost_reference cr
        WHERE cr.product_name = sd.product_name
       ) AS weighted_cost,
       (SELECT MAX(cr.cost_1 + cr.cost_2) KEEP (DENSE_RANK FIRST ORDER BY cr.ref_coefficient DESC)
        FROM cost_reference cr
        WHERE cr.product_name = sd.product_name
       ) AS normal_cost
FROM sales_data sd JOIN
     product_data pd
     ON sd.product_name = pd.product_name
WHERE sd.date = TRUNC(SYSDATE)
For this query, you want the following indexes:
sales_data(date, product_name)
product_data(product_name)
cost_reference(product_name, ref_coefficient, cost_1, cost_2)
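As a sketch, the index list above could be created as follows. The index names are made up for illustration, and `sale_date` is again a hypothetical stand-in for the question's date column; adjust both to the real schema.

```sql
-- Supports the date filter and then the join on product_name.
CREATE INDEX sales_data_date_prod_ix
    ON sales_data (sale_date, product_name);

-- Supports the join from sales_data to product_data.
CREATE INDEX product_data_prod_ix
    ON product_data (product_name);

-- Covers the correlated subqueries: every column they read is in the index,
-- so cost_reference itself does not need to be visited.
CREATE INDEX cost_ref_cover_ix
    ON cost_reference (product_name, ref_coefficient, cost_1, cost_2);
```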
Since you say that cost_reference has only about 2,500 rows, you can try this:
SELECT pd.product_id, sd.column_1, sd.column_2,
       cr.weighted_cost,
       cr.normal_cost
FROM sales_data sd JOIN
     product_data pd
     ON sd.product_name = pd.product_name JOIN
     (SELECT cr.product_name,
             ROUND(SUM(cr.cost_1 + cr.cost_2) / MAX(cr.ref_coefficient), 4) AS weighted_cost,
             MAX(cr.cost_1 + cr.cost_2) KEEP (DENSE_RANK FIRST ORDER BY cr.ref_coefficient DESC) AS normal_cost
      FROM cost_reference cr
      GROUP BY cr.product_name
     ) cr
     ON cr.product_name = sd.product_name
WHERE sd.date = TRUNC(SYSDATE)
The idea is to aggregate cost_reference just once; since the table is small, the aggregated result set is small too. NB: I did not fully test this SQL against these tables.
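One difference from the scalar-subquery version is worth keeping in mind: an inner join drops sales rows whose product has no rows in cost_reference, whereas the correlated subqueries would return NULL costs for them. If those rows must be kept, a LEFT JOIN variant might look like the sketch below. It is untested against the real tables; the CTE name `cost_agg` and the column `sale_date` are hypothetical stand-ins.

```sql
WITH cost_agg AS (
    -- One pass over cost_reference: one row per product_name.
    SELECT cr.product_name,
           ROUND(SUM(cr.cost_1 + cr.cost_2) / MAX(cr.ref_coefficient), 4) AS weighted_cost,
           MAX(cr.cost_1 + cr.cost_2)
               KEEP (DENSE_RANK FIRST ORDER BY cr.ref_coefficient DESC)  AS normal_cost
      FROM cost_reference cr
     GROUP BY cr.product_name
)
SELECT pd.product_id, sd.column_1, sd.column_2,
       ca.weighted_cost, ca.normal_cost
  FROM sales_data sd
  JOIN product_data pd
    ON pd.product_name = sd.product_name
  -- LEFT JOIN keeps sales rows even when no cost rows exist (costs are NULL).
  LEFT JOIN cost_agg ca
    ON ca.product_name = sd.product_name
 WHERE sd.sale_date = TRUNC(SYSDATE);
```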