
Can anybody tell me which one will perform better with millions of rows of data in SQL

Which one is better in SQL:

  1. Common Table Expression (CTE)
  2. Temp table
  3. Table variable

when our table holds 10,000,000 records and a subquery is needed to fetch the records:

WITH cteData AS 
(
    SELECT 
        Product_Id, Variant_Id, Variant_Name, Unit_Price,
        (SELECT GST FROM Product_Details P with(nolock) 
         WHERE V.Product_Id = P.Product_Id) AS GST 
    FROM 
        Variant_Details V with(nolock)
)
SELECT 
    Product_Id, Variant_Id, Variant_Name, 
    SUM(Unit_Price * GST) AS Variant_Total_Price,
    (SELECT SUM(C.Unit_Price * C.GST) FROM cteData C 
     WHERE CD.Product_Id = C.Product_Id) AS Product_Total_Price
FROM 
    cteData CD
GROUP BY 
    Product_Id, Variant_Id, Variant_Name

Or:

SELECT 
    Product_Id, Variant_Id, Variant_Name, Unit_Price,
    (SELECT GST FROM Product_Details P with(nolock) 
     WHERE V.Product_Id = P.Product_Id) AS GST 
INTO 
    #tempData 
FROM 
    Variant_Details V with(nolock)

SELECT 
    Product_Id, Variant_Id, Variant_Name, 
    SUM(Unit_Price * GST) AS Variant_Total_Price,
    (SELECT SUM(C.Unit_Price * C.GST) FROM #tempData C 
     WHERE CD.Product_Id = C.Product_Id) AS Product_Total_Price
FROM 
    #tempData CD
GROUP BY 
    Product_Id, Variant_Id, Variant_Name

Of these two approaches, which one is better when the tables hold millions of records?

As with all performance questions, you should test this yourself, on your own database, with your hardware and your data.
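One way to run such a test in SQL Server is to turn on the session statistics and compare the logical reads, CPU time and elapsed time reported for each candidate. The following is only a sketch: it reuses the temp-table variant from the question, assumes SQL Server 2016 or later for `DROP TABLE IF EXISTS`, and the other candidates would be pasted into the same session for comparison.

```sql
-- Report I/O and timing for every statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate: the temp-table variant from the question.
DROP TABLE IF EXISTS #tempData;  -- assumption: SQL Server 2016+

SELECT V.Product_Id, V.Variant_Id, V.Variant_Name, V.Unit_Price,
       (SELECT P.GST FROM Product_Details P
        WHERE V.Product_Id = P.Product_Id) AS GST
INTO #tempData
FROM Variant_Details V;

SELECT CD.Product_Id, CD.Variant_Id, CD.Variant_Name,
       SUM(CD.Unit_Price * CD.GST) AS Variant_Total_Price,
       (SELECT SUM(C.Unit_Price * C.GST) FROM #tempData C
        WHERE CD.Product_Id = C.Product_Id) AS Product_Total_Price
FROM #tempData CD
GROUP BY CD.Product_Id, CD.Variant_Id, CD.Variant_Name;

-- Run the other candidates here, then compare the Messages output.

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Running each candidate several times, and looking at the actual execution plans alongside these numbers, usually gives a fairer comparison than a single cold run.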

That said, I would write the query like this:

SELECT v.Product_Id, v.Variant_Id, v.Variant_Name,
       SUM(v.Unit_Price * p.GST) AS variant_total_price,
       SUM(SUM(v.Unit_Price * p.GST)) OVER (PARTITION BY v.Product_Id) AS product_total_price
FROM Variant_Details v LEFT JOIN
     Product_Details p
     ON v.Product_Id = p.Product_Id
GROUP BY v.Product_Id, v.Variant_Id, v.Variant_Name;

With the right indexes, I would expect this to be faster than the other alternatives.
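What "the right indexes" means depends on the real schema, which the question does not show. As a rough sketch only, covering indexes along these lines would let the join and the grouping be answered from the indexes alone. The index names are hypothetical, and the column choice assumes that `Variant_Name` is small enough to be an index key and that `Product_Id` is not already the clustered key of `Product_Details`.

```sql
-- Hypothetical index names; adjust to the actual schema.

-- Supports the GROUP BY columns and covers Unit_Price.
CREATE NONCLUSTERED INDEX IX_Variant_Details_Product_Variant
    ON Variant_Details (Product_Id, Variant_Id, Variant_Name)
    INCLUDE (Unit_Price);

-- Supports the join on Product_Id and covers GST.
CREATE NONCLUSTERED INDEX IX_Product_Details_Product
    ON Product_Details (Product_Id)
    INCLUDE (GST);
```

Whether the optimizer actually uses them is something to confirm in the execution plan.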

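Consider using a join and window functions instead of a CTE and subqueries. This should do what you want, and perform better than the two solutions you are comparing: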

-- unit_price lives in variant_details (per the question), so it is read from v;
-- only grouped columns and aggregates appear in the select list
select v.product_id, v.variant_id, v.variant_name,
    sum(v.unit_price * pd.gst) variant_total_price,
    sum(sum(v.unit_price * pd.gst)) over(partition by v.product_id) product_total_price
from variant_details v
left join product_details pd on pd.product_id = v.product_id
group by v.product_id, v.variant_id, v.variant_name
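To see what the nested `SUM(SUM(...)) OVER (...)` does, here is a small self-contained sketch on made-up data. The table variables, column types and values are all assumptions for illustration, not the real schema. The inner `SUM` is the per-variant aggregate produced by `GROUP BY`; the outer windowed `SUM` then adds those group totals up per product.

```sql
-- Hypothetical sample data; column types are assumptions.
DECLARE @Product_Details TABLE (Product_Id INT PRIMARY KEY, GST DECIMAL(5, 2));
DECLARE @Variant_Details TABLE (
    Product_Id   INT,
    Variant_Id   INT,
    Variant_Name VARCHAR(50),
    Unit_Price   DECIMAL(10, 2)
);

INSERT INTO @Product_Details (Product_Id, GST)
VALUES (1, 1.10), (2, 1.20);

INSERT INTO @Variant_Details (Product_Id, Variant_Id, Variant_Name, Unit_Price)
VALUES (1, 1, 'A', 100.00), (1, 2, 'B', 200.00), (2, 3, 'C', 50.00);

SELECT v.Product_Id, v.Variant_Id, v.Variant_Name,
       SUM(v.Unit_Price * p.GST) AS variant_total_price,
       SUM(SUM(v.Unit_Price * p.GST)) OVER (PARTITION BY v.Product_Id) AS product_total_price
FROM @Variant_Details v
LEFT JOIN @Product_Details p ON p.Product_Id = v.Product_Id
GROUP BY v.Product_Id, v.Variant_Id, v.Variant_Name
ORDER BY v.Product_Id, v.Variant_Id;

-- Variant totals: A = 110, B = 220, C = 60.
-- Product totals: 330 for both rows of product 1, 60 for product 2.
```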

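I can see that you are also using GROUP BY in your query. Grouping across millions of records will definitely reduce performance.

I suggest an OUTER APPLY solution instead. Here is the code for it: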

    SELECT v.Product_Id, v.Variant_Id, v.Variant_Name,
       val.variant_total_price,
       SUM(val.variant_total_price) OVER (PARTITION BY v.Product_Id) AS product_total_price
    FROM Variant_Details v 
    OUTER APPLY
    (
      -- v.Unit_Price is kept outside SUM: SQL Server rejects an aggregate
      -- that mixes an outer reference with columns of the inner table
      SELECT v.Unit_Price * SUM(p.GST) AS variant_total_price
      FROM Product_Details p WHERE v.Product_Id = p.Product_Id
    ) val

