How can this query be optimized (N+1 and DENSE_RANK after)?
WITH stats AS (SELECT a.IntegratorSalesAssociateID,
a.AgentName,
(
SELECT COUNT(*)
FROM properties AS p
WHERE a.IntegratorSalesAssociateID = p.IntegratorSalesAssociateID
AND p.TransactionType = '2'
AND MONTH(p.OrigListingDate) = MONTH(CURRENT_DATE)
AND YEAR(p.OrigListingDate) = YEAR(CURRENT_DATE)
) AS properties_this_month
FROM agents AS a)
SELECT stats.*,
DENSE_RANK() over (ORDER BY stats.properties_this_month DESC) AS 'rank'
from stats
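I was thinking that maybe if I joined the two tables and grouped them somehow it would perform better. Currently it runs for 17.5 seconds, and weirdly enough, adding the DENSE_RANK doesn't affect performance at all.
Relevant table structures: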
CREATE TABLE `agents`
(
`IntegratorSalesAssociateID` varchar(15) COLLATE utf8mb4_unicode_ci NOT NULL,
`AgentName` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL
) ENGINE = InnoDB
DEFAULT CHARSET = utf8mb4
COLLATE = utf8mb4_unicode_ci;
CREATE TABLE `properties`
(
`id` bigint(20) UNSIGNED NOT NULL,
`IntegratorSalesAssociateID` varchar(13) COLLATE utf8mb4_unicode_ci NOT NULL,
`TransactionType` tinyint(4) NOT NULL,
`OrigListingDate` date DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL
) ENGINE = InnoDB
DEFAULT CHARSET = utf8mb4
COLLATE = utf8mb4_unicode_ci;
You can try this:
WITH stats AS
(
SELECT
p.IntegratorSalesAssociateID
, COUNT(*) AS properties_this_month
FROM properties AS p
WHERE p.TransactionType = '2'
AND MONTH(p.OrigListingDate) = MONTH(CURRENT_DATE)
AND YEAR(p.OrigListingDate) = YEAR(CURRENT_DATE)
GROUP BY p.IntegratorSalesAssociateID
)
SELECT
a.IntegratorSalesAssociateID
, a.AgentName
, COALESCE(s.properties_this_month, 0) AS properties_this_month
FROM agents AS a
LEFT JOIN stats s ON a.IntegratorSalesAssociateID = s.IntegratorSalesAssociateID
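The join version above leaves out the ranking that the question asks for. A minimal sketch of putting it back, assuming a server with window-function support (the question already relies on it) and ranking on the coalesced count so that agents with no listings share the last rank:

```sql
WITH stats AS
(
    SELECT
        p.IntegratorSalesAssociateID
        , COUNT(*) AS properties_this_month
    FROM properties AS p
    WHERE p.TransactionType = '2'
        AND MONTH(p.OrigListingDate) = MONTH(CURRENT_DATE)
        AND YEAR(p.OrigListingDate) = YEAR(CURRENT_DATE)
    GROUP BY p.IntegratorSalesAssociateID
)
SELECT
    a.IntegratorSalesAssociateID
    , a.AgentName
    , COALESCE(s.properties_this_month, 0) AS properties_this_month
    -- Rank on the same expression that is displayed, so agents without
    -- any matching rows (NULL from the LEFT JOIN) are ranked as 0.
    , DENSE_RANK() OVER (ORDER BY COALESCE(s.properties_this_month, 0) DESC) AS `rank`
FROM agents AS a
LEFT JOIN stats AS s ON a.IntegratorSalesAssociateID = s.IntegratorSalesAssociateID;
```

The alias is quoted with backticks because `rank` is a reserved word in recent MySQL versions.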
Given that DENSE_RANK() does not affect performance, what you need to optimize is:
SELECT a.IntegratorSalesAssociateID,
a.AgentName,
(SELECT COUNT(*)
FROM properties p
WHERE a.IntegratorSalesAssociateID = p.IntegratorSalesAssociateID AND
p.TransactionType = '2' AND
MONTH(p.OrigListingDate) = MONTH(CURRENT_DATE) AND
YEAR(p.OrigListingDate) = YEAR(CURRENT_DATE)
) AS properties_this_month
FROM agents a;
I would rewrite it as:
SELECT a.IntegratorSalesAssociateID,
a.AgentName,
(SELECT COUNT(*)
FROM properties p
WHERE a.IntegratorSalesAssociateID = p.IntegratorSalesAssociateID AND
p.TransactionType = 2 AND
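p.OrigListingDate >= CURRENT_DATE - INTERVAL (DAY(CURRENT_DATE) - 1) DAY AND  -- first day of this month
p.OrigListingDate <= LAST_DAY(CURRENT_DATE)  -- excludes later months, as MONTH()/YEAR() did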
) AS properties_this_month
FROM agents a;
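The lower bound in the date filter is meant to be the first day of the current month. One way to sanity-check this kind of interval arithmetic is to evaluate it for a fixed date before trusting it in the query. A minimal sketch, where the literal date is only an illustrative value:

```sql
-- Illustrative date; in the real query CURRENT_DATE takes its place.
SET @d = DATE('2024-03-17');

SELECT
    @d - INTERVAL (DAY(@d) - 1) DAY AS month_start,  -- 2024-03-01
    LAST_DAY(@d)                    AS month_end;    -- 2024-03-31
```

Subtracting `DAY(@d) - 1` days moves back to day 1 of the same month. Reversing the operands (`1 - DAY(@d)`) under a minus sign would instead move forward into the future.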
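The two changes are:
TransactionType looks like a number. Assuming it is, I removed the single quotes. Don't mix data types. Of course, if the column is a string, then use the single quotes.
The date filter compares OrigListingDate against a range directly instead of wrapping the column in MONTH() and YEAR(), so an index on that column can be used for the comparison.
Then, for this query, you want an index on properties(IntegratorSalesAssociateID, TransactionType, OrigListingDate). Actually, that index would probably help the original version of the query as well.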
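For reference, the recommended index could be created along these lines. The index name is a hypothetical choice; only the table and the column order come from the answer:

```sql
-- Equality columns first (agent id, transaction type), range column last,
-- so the whole per-agent filter can be resolved from the index.
CREATE INDEX idx_properties_agent_type_date
    ON properties (IntegratorSalesAssociateID, TransactionType, OrigListingDate);

-- Confirm that the index exists and check its column order.
SHOW INDEX FROM properties;
```

Because the subquery only counts rows and touches no other columns, this index should also be covering for it, though whether the optimizer actually uses it is worth verifying on the real data.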
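I sincerely doubt that using explicit aggregation would improve performance. GROUP BY, although very powerful, is often slower than a correlated subquery. And with the right indexes, it is almost always slower (or at least not faster).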
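Which form wins depends on the data and the indexes, so it is worth measuring both. A rough sketch, assuming MySQL 8.0.18 or later, where EXPLAIN ANALYZE is available (MariaDB uses a separate ANALYZE statement instead):

```sql
-- Variant 1: correlated subquery, one index lookup per agent.
EXPLAIN ANALYZE
SELECT a.IntegratorSalesAssociateID,
       a.AgentName,
       (SELECT COUNT(*)
        FROM properties p
        WHERE p.IntegratorSalesAssociateID = a.IntegratorSalesAssociateID
          AND p.TransactionType = 2
          AND p.OrigListingDate >= CURRENT_DATE - INTERVAL (DAY(CURRENT_DATE) - 1) DAY
          AND p.OrigListingDate <= LAST_DAY(CURRENT_DATE)
       ) AS properties_this_month
FROM agents a;

-- Variant 2: aggregate once, then join the result to agents.
EXPLAIN ANALYZE
SELECT a.IntegratorSalesAssociateID,
       a.AgentName,
       COALESCE(s.properties_this_month, 0) AS properties_this_month
FROM agents a
LEFT JOIN (SELECT p.IntegratorSalesAssociateID,
                  COUNT(*) AS properties_this_month
           FROM properties p
           WHERE p.TransactionType = 2
             AND p.OrigListingDate >= CURRENT_DATE - INTERVAL (DAY(CURRENT_DATE) - 1) DAY
             AND p.OrigListingDate <= LAST_DAY(CURRENT_DATE)
           GROUP BY p.IntegratorSalesAssociateID) s
       ON s.IntegratorSalesAssociateID = a.IntegratorSalesAssociateID;
```

EXPLAIN ANALYZE actually executes each statement and reports the measured time per plan step, so the two outputs can be compared directly.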