MySQL slow query with multiple joins and subqueries
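I have 3 tables: Pi, Pidl and Pirl.
Basically, an image is downloaded and a log record is created in Pidl. After that it is resized, and a record is created in Pirl. That record is linked to the Pidl record.
I'm writing a query to find the images that need resizing; it basically queries Pidl. The algorithm I designed is simple: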
for each Image in Pi {
    pidlA = newest_pidl(Image);
    if (pidlA.status == success) {
        pirlA = newest_pirl(Image);
        if (pirlA.pidl.hash != pidlA.hash) {
            go;
        }
        else if (pirlA.status != success) {
            failed_attempts = failed_pirl_count(pirlA, newest_succesful_pirl(Image))
            decide based on pirlA.time and failed_attempts if go or not
        }
        else {
            dont go;
        }
    }
    else {
        dont go;
    }
}
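To make the intended behaviour concrete, here is a minimal Python sketch of the pseudocode above, run against the sample rows given further down. Several things in it are assumptions rather than part of the question: status 1 means success (taken from `pidl_status = 1` in the query), "newest" means highest (time, id), an image with no resize record yet counts as "go" (as the query does with `sub_pirl_id IS NULL`), and `retry_policy` is a hypothetical hook for the undecided "go or not" rule, defaulting to always retry.

```python
from dataclasses import dataclass

SUCCESS = 1  # assumption: status 1 means success, as in the query's pidl_status = 1


@dataclass(frozen=True)
class Pidl:
    pidl_id: int
    pi_id: int
    status: int
    time: int
    hash: str


@dataclass(frozen=True)
class Pirl:
    pirl_id: int
    pidl_id: int
    status: int
    time: int


def needs_resize(pi_id, pidls, pirls, retry_policy=lambda last_time, failed: True):
    """Return True if image pi_id should be resized (the "go" branches)."""
    by_id = {d.pidl_id: d for d in pidls}
    downloads = [d for d in pidls if d.pi_id == pi_id]
    if not downloads:
        return False
    newest_pidl = max(downloads, key=lambda d: (d.time, d.pidl_id))
    if newest_pidl.status != SUCCESS:
        return False  # dont go
    resizes = [r for r in pirls if by_id[r.pidl_id].pi_id == pi_id]
    if not resizes:
        return True  # assumption: never resized means go
    newest_pirl = max(resizes, key=lambda r: (r.time, r.pirl_id))
    if by_id[newest_pirl.pidl_id].hash != newest_pidl.hash:
        return True  # go: the newest resize worked on different content
    if newest_pirl.status != SUCCESS:
        # Count failed attempts since the newest successful resize (if any).
        ok_keys = [(r.time, r.pirl_id) for r in resizes if r.status == SUCCESS]
        last_ok = max(ok_keys, default=(-1, -1))
        failed = sum(1 for r in resizes
                     if r.status != SUCCESS and (r.time, r.pirl_id) > last_ok)
        return retry_policy(newest_pirl.time, failed)
    return False  # dont go


# Sample rows from the question.
PIDLS = [Pidl(1, 3, 1, 100, "hashA"), Pidl(2, 3, 1, 150, "hashB"),
         Pidl(3, 4, 2, 200, "hashC"), Pidl(4, 3, 1, 200, "hashA")]
PIRLS = [Pirl(1, 2, 0, 100), Pirl(2, 3, 1, 150), Pirl(3, 1, 2, 200)]

candidates = [pi for pi in (3, 4, 5) if needs_resize(pi, PIDLS, PIRLS)]
print(candidates)  # [3]
```

With the default policy only image 3 qualifies: its newest download succeeded and its newest resize attempt failed. Passing something like `retry_policy=lambda last_time, failed: failed < 2` would turn that into "dont go", because two failed attempts are already on record.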
Here is my query so far. It is not finished (the failed-attempts part is still missing), but it is already too slow, so I want to fix that first.
SELECT
    pidl1A.pidl_id
FROM Pidl AS pidl1A
LEFT JOIN Pidl AS pidl2A
    ON (
        pidl1A.pidl_pi_id = pidl2A.pidl_pi_id AND
        pidl2A.pidl_status = 1 AND
        (pidl2A.pidl_time > pidl1A.pidl_time OR
            (pidl2A.pidl_id > pidl1A.pidl_id AND pidl1A.pidl_time = pidl2A.pidl_time)
        )
    )
LEFT JOIN (
    # newest pirl subquery
    SELECT
        pidl1B.pidl_pi_id AS sub_pi_id,
        pidl1B.pidl_hash AS sub_pidl_hash,
        pirl1B.pirl_id AS sub_pirl_id,
        pirl1B.pirl_status AS sub_pirl_status
    FROM Pirl AS pirl1B
    INNER JOIN Pidl AS pidl1B ON (pirl1B.pirl_pidl_id = pidl1B.pidl_id)
    LEFT JOIN (
        SELECT
            pidl2B.pidl_pi_id AS sub_pi_id,
            pirl2B.pirl_id AS sub_pirl_id,
            pirl2B.pirl_time AS sub_pirl_time
        FROM Pirl AS pirl2B
        INNER JOIN Pidl AS pidl2B ON (pirl2B.pirl_pidl_id = pidl2B.pidl_id)
        WHERE 1
    ) AS pirl3B
        ON (
            pirl3B.sub_pi_id = pidl1B.pidl_pi_id AND
            (pirl3B.sub_pirl_time > pirl1B.pirl_time OR
                (pirl3B.sub_pirl_time = pirl1B.pirl_time AND
                 pirl3B.sub_pirl_id > pirl1B.pirl_id)
            )
        )
    WHERE
        pirl3B.sub_pirl_id IS NULL
) AS pirl1A
    ON (pirl1A.sub_pi_id = pidl1A.pidl_pi_id)
WHERE
    pidl1A.pidl_status = 1 AND pidl2A.pidl_id IS NULL
    AND (
        pirl1A.sub_pirl_id IS NULL
        OR (
            pidl1A.pidl_hash != pirl1A.sub_pidl_hash
        )
        OR (
            pirl1A.sub_pirl_status != 1
        )
    )
Here is my database schema:
CREATE TABLE Pi (
`pi_id` int,
PRIMARY KEY (`pi_id`)
)
;
CREATE TABLE Pidl
(
`pidl_id` int,
`pidl_pi_id` int,
`pidl_status` int,
`pidl_time` int,
`pidl_hash` varchar(16),
PRIMARY KEY (`pidl_id`)
)
;
alter table Pidl
add constraint fk1_branchNo foreign key (pidl_pi_id) references Pi (pi_id);
CREATE TABLE Pirl
(
`pirl_id` int,
`pirl_pidl_id` int,
`pirl_status` int,
`pirl_time` int,
PRIMARY KEY (`pirl_id`)
)
;
alter table Pirl
add constraint fk2_branchNo foreign key (pirl_pidl_id) references Pidl (pidl_id);
INSERT INTO Pi
(`pi_id`)
VALUES
(3),
(4),
(5);
INSERT INTO Pidl
(`pidl_id`, `pidl_pi_id`,`pidl_status`,`pidl_time`, `pidl_hash`)
VALUES
(1, 3, 1,100, 'hashA'),
(2, 3, 1,150,'hashB'),
(3, 4, 2, 200,'hashC'),
(4, 3, 1, 200,'hashA')
;
INSERT INTO Pirl
(`pirl_id`, `pirl_pidl_id`,`pirl_status`,`pirl_time`)
VALUES
(1, 2, 0,100),
(2, 3, 1,150),
(3, 1, 2, 200)
;
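Of course, with 3 records it is fast. But with about 10-30k records it takes more than 5 seconds. I found that the part that makes it slow is the last part of the WHERE clause: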
AND (
    pirl1A.sub_pirl_id IS NULL
    OR (
        pidl1A.pidl_hash != pirl1A.sub_pidl_hash
    )
    OR (
        pirl1A.sub_pirl_status != 1
    )
)
Another strange thing I found is that adding DISTINCT made the query somewhat faster, but still not fast enough.
Reading your requirements, I would come up with a query like this:
-- image and hash are placeholder column names, not the columns of the posted schema
select pidl.*
from pidl join
     (select image, max(pidl_time) as pidl_time
      from pidl
      group by image
     ) maxpidl
     on pidl.image = maxpidl.image and pidl.pidl_time = maxpidl.pidl_time left join
     pirl
     on pidl.hash = pirl.hash
where pirl.hash is null;
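I think you have some additional conditions that are not fully explained (for example, the role of status). You should be able to incorporate them.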
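As an illustration, here is one way the query above could be mapped onto the posted schema, wrapped in Python so it can be run as-is. This is a sketch under assumptions: SQLite (from the standard library) stands in for MySQL, `image` is read as `pidl_pi_id`, the join to Pirl goes through `pirl_pidl_id` because Pirl has no hash column, and the `pidl_status = 1` filter is added as a guess at the missing status condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pidl (pidl_id INTEGER PRIMARY KEY, pidl_pi_id INT, pidl_status INT,
                   pidl_time INT, pidl_hash TEXT);
CREATE TABLE Pirl (pirl_id INTEGER PRIMARY KEY, pirl_pidl_id INT, pirl_status INT,
                   pirl_time INT);
-- Sample rows from the question.
INSERT INTO Pidl VALUES (1, 3, 1, 100, 'hashA'), (2, 3, 1, 150, 'hashB'),
                        (3, 4, 2, 200, 'hashC'), (4, 3, 1, 200, 'hashA');
INSERT INTO Pirl VALUES (1, 2, 0, 100), (2, 3, 1, 150), (3, 1, 2, 200);
""")

QUERY = """
SELECT d.pidl_id
FROM Pidl AS d
JOIN (SELECT pidl_pi_id, MAX(pidl_time) AS pidl_time   -- newest download per image
      FROM Pidl
      GROUP BY pidl_pi_id) AS newest
  ON d.pidl_pi_id = newest.pidl_pi_id AND d.pidl_time = newest.pidl_time
LEFT JOIN Pirl AS r ON r.pirl_pidl_id = d.pidl_id       -- any resize of that download
WHERE d.pidl_status = 1
  AND r.pirl_id IS NULL                                 -- keep downloads never resized
"""

rows = [row[0] for row in conn.execute(QUERY)]
print(rows)  # [4]
```

On the sample data this returns download 4, the same row the original query returns. It is not an exact replacement, though: it treats any existing resize record for the newest download as "done", does not compare hashes across downloads, and returns several rows per image when two downloads share the same `pidl_time`.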
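In MySQL, you should avoid subqueries in the from clause. These are materialized, so that work adds extra overhead, and the engine then cannot use indexes on the result.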
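One way to follow that advice is to replace the derived tables with correlated NOT EXISTS checks and give them composite indexes to work with. The sketch below is an assumption-laden illustration, not a measured fix: SQLite stands in for MySQL, the index names and column choices are hypothetical, and the logic is the original WHERE clause inverted (skip an image only when its newest resize succeeded and worked on the same hash).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pidl (pidl_id INTEGER PRIMARY KEY, pidl_pi_id INT, pidl_status INT,
                   pidl_time INT, pidl_hash TEXT);
CREATE TABLE Pirl (pirl_id INTEGER PRIMARY KEY, pirl_pidl_id INT, pirl_status INT,
                   pirl_time INT);
-- Hypothetical composite indexes for the "newest per image" lookups.
CREATE INDEX idx_pidl_pi_time ON Pidl (pidl_pi_id, pidl_time, pidl_id);
CREATE INDEX idx_pirl_pidl_time ON Pirl (pirl_pidl_id, pirl_time);
INSERT INTO Pidl VALUES (1, 3, 1, 100, 'hashA'), (2, 3, 1, 150, 'hashB'),
                        (3, 4, 2, 200, 'hashC'), (4, 3, 1, 200, 'hashA');
INSERT INTO Pirl VALUES (1, 2, 0, 100), (2, 3, 1, 150), (3, 1, 2, 200);
""")

QUERY = """
SELECT d.pidl_id
FROM Pidl AS d
WHERE d.pidl_status = 1
  -- d is the newest successful download of its image
  AND NOT EXISTS (
        SELECT 1 FROM Pidl AS d2
        WHERE d2.pidl_pi_id = d.pidl_pi_id
          AND d2.pidl_status = 1
          AND (d2.pidl_time > d.pidl_time
               OR (d2.pidl_time = d.pidl_time AND d2.pidl_id > d.pidl_id)))
  -- and the image's newest resize is not a successful one with the same hash
  AND NOT EXISTS (
        SELECT 1
        FROM Pirl AS r
        JOIN Pidl AS rd ON rd.pidl_id = r.pirl_pidl_id
        WHERE rd.pidl_pi_id = d.pidl_pi_id
          AND r.pirl_status = 1
          AND rd.pidl_hash = d.pidl_hash
          AND NOT EXISTS (
                SELECT 1
                FROM Pirl AS r2
                JOIN Pidl AS rd2 ON rd2.pidl_id = r2.pirl_pidl_id
                WHERE rd2.pidl_pi_id = rd.pidl_pi_id
                  AND (r2.pirl_time > r.pirl_time
                       OR (r2.pirl_time = r.pirl_time AND r2.pirl_id > r.pirl_id))))
"""

rows = [row[0] for row in conn.execute(QUERY)]
print(rows)  # [4]
```

Whether this form is actually faster depends on the MySQL version and the data, so it is worth comparing the EXPLAIN output of both queries before switching.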
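Your query uses no indexes and uses views in subqueries. That can be slow. I suggest creating new table(s) holding the information you need, or a materialized view, to serve as an index.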
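A rough sketch of the "new table" idea, in Python with SQLite standing in for MySQL: a hypothetical summary table `PiState` keeps one row per image with its newest successful download and newest resize attempt, updated whenever a log row is written. The table, its columns and the helper functions are all made up for illustration, and the sketch assumes log rows are inserted in chronological order, so the row just written is always the newest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pidl (pidl_id INTEGER PRIMARY KEY, pidl_pi_id INT, pidl_status INT,
                   pidl_time INT, pidl_hash TEXT);
CREATE TABLE Pirl (pirl_id INTEGER PRIMARY KEY, pirl_pidl_id INT, pirl_status INT,
                   pirl_time INT);
-- Hypothetical summary table: one row per image.
CREATE TABLE PiState (
    pi_id      INTEGER PRIMARY KEY,
    dl_pidl_id INT,   -- newest successful download
    dl_hash    TEXT,
    rs_pirl_id INT,   -- newest resize attempt
    rs_hash    TEXT,  -- hash of the download that attempt worked on
    rs_status  INT
);
""")


def add_image(pi_id):
    conn.execute("INSERT INTO PiState (pi_id) VALUES (?)", (pi_id,))


def record_download(pidl_id, pi_id, status, time, hash_):
    conn.execute("INSERT INTO Pidl VALUES (?, ?, ?, ?, ?)",
                 (pidl_id, pi_id, status, time, hash_))
    if status == 1:  # only successful downloads move the pointer
        conn.execute("UPDATE PiState SET dl_pidl_id = ?, dl_hash = ? WHERE pi_id = ?",
                     (pidl_id, hash_, pi_id))


def record_resize(pirl_id, pidl_id, status, time):
    conn.execute("INSERT INTO Pirl VALUES (?, ?, ?, ?)", (pirl_id, pidl_id, status, time))
    pi_id, hash_ = conn.execute(
        "SELECT pidl_pi_id, pidl_hash FROM Pidl WHERE pidl_id = ?", (pidl_id,)).fetchone()
    conn.execute("UPDATE PiState SET rs_pirl_id = ?, rs_hash = ?, rs_status = ? "
                 "WHERE pi_id = ?", (pirl_id, hash_, status, pi_id))


QUERY = """
SELECT dl_pidl_id FROM PiState
WHERE dl_pidl_id IS NOT NULL
  AND (rs_pirl_id IS NULL OR rs_hash != dl_hash OR rs_status != 1)
"""


def pending():
    return [row[0] for row in conn.execute(QUERY)]


# Replay the question's sample rows in insertion order.
for pi_id in (3, 4, 5):
    add_image(pi_id)
for row in [(1, 3, 1, 100, "hashA"), (2, 3, 1, 150, "hashB"),
            (3, 4, 2, 200, "hashC"), (4, 3, 1, 200, "hashA")]:
    record_download(*row)
for row in [(1, 2, 0, 100), (2, 3, 1, 150), (3, 1, 2, 200)]:
    record_resize(*row)

print(pending())  # [4]
```

The lookup becomes a scan of one small table with no joins, at the cost of an extra UPDATE per log insert. In MySQL the same bookkeeping could also live in triggers instead of application code.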