How to optimize a SQL query that returns millions of records
All the examples below currently apply to only one sales organization; we may have more salesOrganization numbers in the future.
I have 6 tables that hold millions of records. These tables are populated by executing an SSIS package.
select count(*) from tmp_materials --11,02,032
select count(*) from tbl_VendorLogoData --20,41,501
select count(*) from TBL_Image_EDV --4,44,063
select count(*) from TBL_EXTPRODUCTATTRIBUTES_EDV -- 2,06,15,572
select count(*) from TBL_Accessories_EDV --10,11,568
select count(*) from TBL_SimilarSku --64,10,408
I have a stored procedure that selects distinct records from these tables:
SELECT DISTINCT 'MD' AS [COMPANYCD]
, MAIN.MATERIAL AS [MATERIAL]
, ISNULL(UPPER(IMG.[lowprovider]),'') AS [LOW_IMAGE]
, ISNULL(UPPER(IMG.[midprovider]), '') AS [MID_IMAGE]
, ISNULL(UPPER(IMG.[highprovider]), '') AS [HIGH_IMAGE]
, ISNULL(UPPER(VS.isLogo),'') AS [VENDOR_LOGO]
, ISNULL(UPPER(DS.PROVIDER), '') AS [DATASHEET]
, ISNULL(ACC.AccessoryMaterial, '') AS [OPT_ACC]
, ISNULL(UPPER(ACC.Provider), '') AS [OPT_ACC_PROVIDER]
, ISNULL(SS.Similarsku, '') AS [SIMILAR_SKU]
, ISNULL(UPPER(SS.ProviderName), '') AS [SIMILAR_SKU_PROVIDER]
FROM tmp_materials MAIN WITH (NOLOCK)
LEFT OUTER JOIN TBL_Image_EDV IMG WITH (NOLOCK) ON MAIN.MATERIAL = IMG.MATERIAL AND MAIN.salesOrg = IMG.SalesOrganization
LEFT OUTER JOIN TBL_EXTPRODUCTATTRIBUTES_EDV DS WITH (NOLOCK) ON MAIN.MATERIAL = DS.SKUNBR AND MAIN.salesOrg = DS.SalesOrganization
LEFT OUTER JOIN TBL_Accessories_EDV ACC WITH (NOLOCK) ON MAIN.MATERIAL = ACC.ParentSKU AND MAIN.salesOrg = DS.SalesOrganization
LEFT OUTER JOIN TBL_SimilarSku SS WITH (NOLOCK) ON MAIN.MATERIAL = SS.ParentSKU AND MAIN.salesOrg = DS.SalesOrganization
LEFT OUTER JOIN tbl_VendorLogoData VS WITH (NOLOCK) ON MAIN.MATERIAL = VS.SKU AND MAIN.salesOrg = VS.Salesorganization
WHERE MAIN.salesOrg = @SALESORGANIZATION
AND (CASE WHEN IMG.MATERIAL IS NULL AND DS.SKUNBR IS NULL AND ACC.ParentSKU IS NULL AND SS.ParentSKU IS NULL AND VS.SKU IS NULL
THEN 0 ELSE 1 END) = 1
AND DS.Provider <> 'novalue'
AND SS.RecordIdentifier like '%@@%'
AND ACC.RecordIdentifier like '%@@%'
AND ACC.accessorySku LIKE '%@@%'
The parameter for this procedure is @SALESORGANIZATION. It is used to populate a report.
I am running this for multiple salesorganization values, but for one of them it takes more than 5 hours to generate the data.
It seems I will need to write a loop, but I am finding it difficult to proceed with multiple joins. Any suggestions?
Please advise how I can optimize this query. Thanks for your assistance.
Here is the execution plan file: SQL Execution Plan
Try to include more conditions to improve performance.
In all the LEFT OUTER JOIN conditions, include your parameter directly rather than relying only on the joined tables' columns, e.g.
LEFT OUTER JOIN TBL_Image_EDV IMG WITH (NOLOCK) ON
MAIN.MATERIAL = IMG.MATERIAL
AND MAIN.salesOrg = IMG.SalesOrganization
AND IMG.SalesOrganization=@SALESORGANIZATION
Add conditions in the WHERE clause:
AND SS.RecordIdentifier like '%@@%'
should be
(SS.id IS NOT NULL AND SS.RecordIdentifier like '%@@%')
This way you first cut out the rows that have no matching record.
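A small sketch of how such a WHERE condition interacts with a LEFT OUTER JOIN may help here. The two table variables and their rows are hypothetical and only mimic the shape of tmp_materials and TBL_SimilarSku. A LIKE test placed in WHERE is never true for the NULLs of an unmatched row, so the join then behaves like an INNER JOIN; the same test placed in ON keeps the unmatched row.

```sql
-- Hypothetical miniature tables, used only to illustrate the behaviour.
DECLARE @Main TABLE (Material varchar(10));
DECLARE @Sim  TABLE (ParentSKU varchar(10), RecordIdentifier varchar(20));

INSERT INTO @Main VALUES ('A'), ('B');
INSERT INTO @Sim  VALUES ('A', 'x@@y');

-- Filter in WHERE: material 'B' has no match, so RecordIdentifier is NULL,
-- NULL LIKE '%@@%' is not true, and 'B' is dropped. Returns 1 row ('A').
SELECT m.Material, s.RecordIdentifier
FROM @Main m
LEFT OUTER JOIN @Sim s ON m.Material = s.ParentSKU
WHERE s.RecordIdentifier LIKE '%@@%';

-- Filter in ON: 'B' is kept with a NULL RecordIdentifier. Returns 2 rows.
SELECT m.Material, s.RecordIdentifier
FROM @Main m
LEFT OUTER JOIN @Sim s ON m.Material = s.ParentSKU
                      AND s.RecordIdentifier LIKE '%@@%';
```

If the report really should contain only materials that have matching records, the first form is what the question's query already does; if unmatched materials should survive, the conditions belong in the ON clauses instead.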
UPDATE
One more idea:
Try
INNER JOIN (select *
from TBL_Accessories_EDV WITH (NOLOCK) -- the table hint goes on the base table; it is not allowed after a derived table's alias
where RecordIdentifier like '%@@%'
AND accessorySku LIKE '%@@%') ACC
ON MAIN.MATERIAL = ACC.ParentSKU AND MAIN.salesOrg = DS.SalesOrganization
This just restricts the number of records up front, instead of filtering them out later.
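Putting the ideas together, the whole query could be sketched roughly as follows. This is an untested sketch that uses only the tables and columns from the question. It assumes the WHERE conditions on DS, SS and ACC are intended, which means those three joins already behave as inner joins; the CASE check then becomes redundant and is left out, as is the comparison against DS.SalesOrganization in the ACC and SS joins, because DS is now inner joined on the same sales organization. Whether this is actually faster should be verified against the execution plan.

```sql
-- Sketch only: runs inside the stored procedure, where @SALESORGANIZATION
-- is the procedure's parameter. Each derived table is filtered before joining.
SELECT DISTINCT 'MD' AS [COMPANYCD]
     , MAIN.MATERIAL AS [MATERIAL]
     , ISNULL(UPPER(IMG.[lowprovider]), '')  AS [LOW_IMAGE]
     , ISNULL(UPPER(IMG.[midprovider]), '')  AS [MID_IMAGE]
     , ISNULL(UPPER(IMG.[highprovider]), '') AS [HIGH_IMAGE]
     , ISNULL(UPPER(VS.isLogo), '')          AS [VENDOR_LOGO]
     , ISNULL(UPPER(DS.PROVIDER), '')        AS [DATASHEET]
     , ISNULL(ACC.AccessoryMaterial, '')     AS [OPT_ACC]
     , ISNULL(UPPER(ACC.Provider), '')       AS [OPT_ACC_PROVIDER]
     , ISNULL(SS.Similarsku, '')             AS [SIMILAR_SKU]
     , ISNULL(UPPER(SS.ProviderName), '')    AS [SIMILAR_SKU_PROVIDER]
FROM tmp_materials MAIN WITH (NOLOCK)
-- DS.Provider <> 'novalue' in the original WHERE already required a DS match.
INNER JOIN (SELECT SKUNBR, PROVIDER
            FROM TBL_EXTPRODUCTATTRIBUTES_EDV WITH (NOLOCK)
            WHERE SalesOrganization = @SALESORGANIZATION
              AND PROVIDER <> 'novalue') DS
        ON MAIN.MATERIAL = DS.SKUNBR
-- Both LIKE filters are applied before the join.
INNER JOIN (SELECT ParentSKU, AccessoryMaterial, Provider
            FROM TBL_Accessories_EDV WITH (NOLOCK)
            WHERE RecordIdentifier LIKE '%@@%'
              AND accessorySku LIKE '%@@%') ACC
        ON MAIN.MATERIAL = ACC.ParentSKU
INNER JOIN (SELECT ParentSKU, Similarsku, ProviderName
            FROM TBL_SimilarSku WITH (NOLOCK)
            WHERE RecordIdentifier LIKE '%@@%') SS
        ON MAIN.MATERIAL = SS.ParentSKU
-- IMG and VS have no WHERE filter in the original, so they stay outer joins.
LEFT OUTER JOIN TBL_Image_EDV IMG WITH (NOLOCK)
        ON MAIN.MATERIAL = IMG.MATERIAL
       AND IMG.SalesOrganization = @SALESORGANIZATION
LEFT OUTER JOIN tbl_VendorLogoData VS WITH (NOLOCK)
        ON MAIN.MATERIAL = VS.SKU
       AND VS.Salesorganization = @SALESORGANIZATION
WHERE MAIN.salesOrg = @SALESORGANIZATION;
```

Selecting only the needed columns in each derived table, instead of select *, keeps the intermediate rows narrow, which may also help the DISTINCT step.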