[英]Convert NOT IN query to better performance
I'm using MySQL 5.0, and I need to fine tune this query. 我正在使用MySQL 5.0,并且需要微调此查询。 Can anyone please tell me what tuning I can do in this? 谁能告诉我在此方面可以做哪些调整?
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
SELECT DISTINCT(alert_master_id) FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
SELECT DISTINCT(alert_master_id) FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL) AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
The first thing that I'd do is rewrite the subqueries as joins : 我要做的第一件事是将子查询重写为joins :
SELECT h.alert_master_id
FROM alert_appln_header h
JOIN schedule_config c
ON c.schedule_name = 'Purging_Config'
LEFT JOIN alert_details d
ON d.alert_master_id = h.alert_master_id
AND d.end_date IS NULL
AND d.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
LEFT JOIN (
alert_sara_header s
JOIN alert_sara_lines l
ON l.alert_sara_master_id = s.sara_master_id
)
ON s.alert_master_id = h.alert_master_id
AND s.end_date IS NULL
AND s.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
WHERE h.created_date < CURRENT_DATE - INTERVAL c.parameters DAY
AND d.alert_master_id IS NULL
AND s.alert_master_id IS NULL
GROUP BY h.alert_master_id
LIMIT 5000
If it's still slow after that, re-examine your indexing strategy. 如果在那之后仍然很慢,请重新检查您的索引编制策略。 I'd suggest indexes over: 我建议索引超过:
alert_appln_header(alert_master_id,created_date)
schedule_config(schedule_name)
alert_details(alert_master_id,end_date,created_date)
alert_sara_header(sara_master_id,alert_master_id,end_date,created_date)
alert_sara_lines(alert_sara_master_id)
OK, this may be just a shot in the dark, but I think you don't need as many DISTINCT
here. 好的,这可能只是黑暗中的一枪,但我认为您在这里不需要那么多DISTINCT
。
SELECT DISTINCT(alert_master_id) FROM alert_appln_header
WHERE created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
AND alert_master_id NOT IN (
-- removed distinct here --
SELECT alert_master_id FROM alert_details
WHERE end_date IS NULL AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
UNION
-- removed distinct here --
SELECT alert_master_id FROM alert_sara_header
WHERE sara_master_id IN
(SELECT alert_sara_master_id FROM alert_sara_lines
WHERE end_date IS NULL)
AND created_date < DATE_SUB(CURDATE(), INTERVAL (SELECT parameters FROM schedule_config WHERE schedule_name = "Purging_Config") DAY)
) LIMIT 5000;
Since using the DISTINCT
is very costly, try to avoid it. 由于使用DISTINCT
的成本很高,因此请避免使用它。 In the first WHERE
clause you are checking for ids
that are NOT
within some result , so it shouldn't matter if in that result some ids
appear more than once. 在第一个WHERE
子句中要检查的ids
是NOT
有些结果中,所以如果该结果一定是不应该的问题ids
出现不止一次。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.