
Optimize Oracle SQL Select with multiple MAX subqueries

In an Oracle 12c database I have the following query:

SELECT pd.product_id,
       sd.column_1,
       sd.column_2,
(SELECT ROUND ( SUM (cost_1 + cost_2) / MAX (ref_coefficient), 4)
   FROM cost_reference
  WHERE product_name = sd.product_name)
     AS weighted_cost,
(SELECT cost_1 + cost_2
   FROM cost_reference
  WHERE product_name = sd.product_name 
    AND ref_coefficient = (SELECT MAX (ref_coefficient)
                              FROM cost_reference
                             WHERE product_name = sd.product_name ))
     AS normal_cost
   FROM sales_data sd, product_data pd
  WHERE sd.product_name = pd.product_name 
    AND sd.date = SYSDATE

The cost_reference table is used for pulling a certain coefficient based on what is loaded in it at the current time. The weighted_cost and normal_cost columns are calculated by the subqueries shown. However, the multiple calls to MAX(ref_coefficient) really slow this query down. The whole thing takes about 3 seconds, but we would like to cut that time down, since the query is called somewhere the data is refreshed constantly and the end result has only about 50 rows. cost_reference usually contains about 2500 rows.

The proper indexes are in place as well; without them the query took about 10 seconds. If I take out these calculated columns, the query is instant. I have tried both joining the cost_reference table and using a WITH clause for it, but had no luck. Is there any way to further optimize this query?

Interesting note: we use Dell Toad as our SQL IDE, so I ran its built-in optimization tool on the query and noticed that one of the generated variants cut the time in half. However, upon inspecting it, all it did was add the following hint comment to the normal_cost calculation. If there aren't any optimization suggestions, does anyone know why this worked? When I remove it, the query goes back to its slower behavior.

SELECT /*+ FULL(COST_REFERENCE) */
      cost_1 + cost_2
 FROM cost_reference

Write the query using proper JOIN syntax. I also recommend qualifying all column names, particularly when using correlated subqueries.

You can rewrite the second subquery so that it uses aggregation. Also, because SYSDATE has a time component, it is highly unlikely that your WHERE clause does what you expect.
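To make the SYSDATE point concrete, here is a minimal sketch that runs against dual only. The column name sale_date and the inline sales_demo rows are hypothetical stand-ins for the question's date column, not part of the original schema.

```sql
-- SYSDATE carries the current time of day; TRUNC(SYSDATE) is today at 00:00:00.
SELECT TO_CHAR(SYSDATE,        'YYYY-MM-DD HH24:MI:SS') AS now_with_time,
       TO_CHAR(TRUNC(SYSDATE), 'YYYY-MM-DD HH24:MI:SS') AS today_midnight
FROM dual;

-- Hypothetical demo rows: sale_date stands in for the question's date column.
WITH sales_demo (sale_date) AS (
  SELECT TRUNC(SYSDATE)        FROM dual UNION ALL  -- today at midnight
  SELECT TRUNC(SYSDATE) + 9/24 FROM dual UNION ALL  -- today at 09:00
  SELECT TRUNC(SYSDATE) - 1    FROM dual            -- yesterday at midnight
)
SELECT COUNT(CASE WHEN sale_date = TRUNC(SYSDATE) THEN 1 END) AS equals_trunc,
       COUNT(CASE WHEN sale_date >= TRUNC(SYSDATE)
                   AND sale_date <  TRUNC(SYSDATE) + 1 THEN 1 END) AS in_today_range
FROM sales_demo;
-- equals_trunc = 1, in_today_range = 2
```

Equality with TRUNC(SYSDATE) only matches rows stored at exactly midnight. If the column can hold a time of day, the half-open range is probably the safer predicate, and it can still use an index on the column.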

So, I would rewrite the query as:

SELECT pd.product_id, sd.column_1, sd.column_2,
       (SELECT ROUND(SUM(cr.cost_1 + cr.cost_2) / MAX(cr.ref_coefficient), 4)
        FROM cost_reference cr
        WHERE cr.product_name = sd.product_name
       ) AS weighted_cost,
       (SELECT MAX(cr.cost_1 + cr.cost_2) KEEP (DENSE_RANK FIRST ORDER BY cr.ref_coefficient DESC)
        FROM cost_reference cr
        WHERE cr.product_name = sd.product_name
       ) AS normal_cost
FROM sales_data sd JOIN
     product_data pd
     ON sd.product_name = pd.product_name
WHERE sd.date = TRUNC(SYSDATE)
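As an illustration of what the KEEP (DENSE_RANK FIRST ...) expression in the rewritten query computes, here is a small self-contained sketch. The cost_demo rows are invented purely for demonstration; only the column names come from the question.

```sql
-- Hypothetical sample rows; only the column names mirror cost_reference.
WITH cost_demo (product_name, cost_1, cost_2, ref_coefficient) AS (
  SELECT 'A', 10, 5, 2 FROM dual UNION ALL
  SELECT 'A', 20, 1, 4 FROM dual UNION ALL  -- highest ref_coefficient for A
  SELECT 'B',  7, 3, 1 FROM dual
)
SELECT product_name,
       -- cost_1 + cost_2 taken from the row(s) with the highest ref_coefficient
       MAX(cost_1 + cost_2)
         KEEP (DENSE_RANK FIRST ORDER BY ref_coefficient DESC) AS normal_cost,
       ROUND(SUM(cost_1 + cost_2) / MAX(ref_coefficient), 4)   AS weighted_cost
FROM cost_demo
GROUP BY product_name
ORDER BY product_name;
-- A: normal_cost = 21, weighted_cost = 9
-- B: normal_cost = 10, weighted_cost = 10
```

One behavioral difference is worth keeping in mind. If two rows for a product tie on the highest ref_coefficient, the original scalar subquery would fail with ORA-01427 (single-row subquery returns more than one row), whereas the KEEP form quietly returns the larger cost_1 + cost_2 among the tied rows.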

For this query, you want the following indexes:

  • sales_data(date, product_name)
  • product_data(product_name)
  • cost_reference(product_name, ref_coefficient, cost_1, cost_2)
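The index list above could be created roughly as follows. The index names are made up for this sketch, and sale_date is a hypothetical stand-in for the question's date column (DATE is a reserved word in Oracle, so the real column presumably has another name). Since the question says indexes already exist, some of these may already be covered.

```sql
-- Index names and sale_date are assumptions; adjust them to the real schema.
CREATE INDEX sales_data_date_prod_ix
  ON sales_data (sale_date, product_name);

CREATE INDEX product_data_prod_ix
  ON product_data (product_name);

-- Contains every column the two cost subqueries read, so Oracle may be able
-- to answer them from the index without visiting the table.
CREATE INDEX cost_ref_cover_ix
  ON cost_reference (product_name, ref_coefficient, cost_1, cost_2);
```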

Since you say that cost_reference has about 2.5k rows, you can try this:

SELECT pd.product_id, sd.column_1, sd.column_2,
       cr.weighted_cost,
       cr.normal_cost
FROM sales_data sd JOIN
     product_data pd
     ON sd.product_name = pd.product_name
     JOIN (SELECT product_name,
                  ROUND(SUM(cost_1 + cost_2) / MAX(ref_coefficient), 4) AS weighted_cost,
                  MAX(cost_1 + cost_2) KEEP (DENSE_RANK FIRST ORDER BY ref_coefficient DESC) AS normal_cost
           FROM cost_reference
           GROUP BY product_name
          ) cr
     ON cr.product_name = sd.product_name
WHERE sd.date = TRUNC(SYSDATE)

The idea is to aggregate cost_reference just once; because the table is small, the aggregated result set is small too. NB: I did not fully test this SQL against these tables.
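One thing to check before adopting the join form: an inner join drops sales rows whose product has no cost_reference rows, whereas the scalar subqueries in the question return NULL for them. The sketch below uses invented inline data (sales_demo, cost_demo) to show a LEFT JOIN variant that keeps the original behavior. Whether that matters depends on the real data.

```sql
-- Hypothetical demo data; product 'C' has no cost rows on purpose.
WITH sales_demo (product_name, column_1) AS (
  SELECT 'A', 'x' FROM dual UNION ALL
  SELECT 'C', 'y' FROM dual
),
cost_demo (product_name, cost_1, cost_2, ref_coefficient) AS (
  SELECT 'A', 10, 5, 2 FROM dual UNION ALL
  SELECT 'A', 20, 1, 4 FROM dual
),
cost_agg AS (
  -- A single pass over the cost data, one output row per product.
  SELECT product_name,
         ROUND(SUM(cost_1 + cost_2) / MAX(ref_coefficient), 4) AS weighted_cost,
         MAX(cost_1 + cost_2)
           KEEP (DENSE_RANK FIRST ORDER BY ref_coefficient DESC) AS normal_cost
  FROM cost_demo
  GROUP BY product_name
)
SELECT sd.product_name, sd.column_1, ca.weighted_cost, ca.normal_cost
FROM sales_demo sd
LEFT JOIN cost_agg ca
       ON ca.product_name = sd.product_name
ORDER BY sd.product_name;
-- A | x | 9      | 21
-- C | y | (null) | (null)   <- an inner join would drop this row
```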
