How to avoid nested subqueries in SQL
I've just added a tagging system to my website and I'm trying to figure out the most efficient way to run scalable queries. Here's a basic working MySQL query to return tag matches for a given user:
SELECT
scans.scan_index,
scans.scan_id,
scans.archive_folder
FROM
tags
INNER JOIN
interpretationtags USING (tagid)
INNER JOIN
interpretations USING (interpretation_id)
INNER JOIN
scans
ON scans.scan_id = interpretations.scan_id
AND scans.archive_folder = interpretations.archive_folder
INNER JOIN
archives
ON scans.archive_folder = archives.archive_folder
WHERE
archives.user_id = "google-authd...."
AND tags.tag = "tag1"
But it gets sticky when I want to query multiple tags for the same scan. You see, tags are present in different interpretations, and there are multiple interpretations for each scan.
Here's a working query for two tags using a subquery:
SELECT
a.scan_index,
a.scan_id,
a.archive_folder
FROM
(
SELECT
scans.scan_index,
scans.scan_id,
scans.archive_folder
FROM
tags
INNER JOIN
interpretationtags USING (tagid)
INNER JOIN
interpretations USING (interpretation_id)
INNER JOIN
scans
ON scans.scan_id = interpretations.scan_id
AND scans.archive_folder = interpretations.archive_folder
INNER JOIN
archives
ON scans.archive_folder = archives.archive_folder
WHERE
archives.user_id = "google-auth2..."
AND tags.tag = "tag1"
)
as a
INNER JOIN
interpretations
ON a.scan_id = interpretations.scan_id
AND a.archive_folder = interpretations.archive_folder
INNER JOIN
interpretationtags USING(interpretation_id)
INNER JOIN
tags USING(tagid)
WHERE
tags.tag = "tag2"
Since this is running on a LAMP stack, I've written some PHP code to iterate over the tags I'd like to include in this AND-style search, building a multi-nested query. Here's one with three:
SELECT
b.scan_index,
b.scan_id,
b.archive_folder
FROM
(
SELECT
a.scan_index,
a.scan_id,
a.archive_folder
FROM
(
SELECT
scans.scan_index,
scans.scan_id,
scans.archive_folder
FROM
tags
INNER JOIN
interpretationtags USING (tagid)
INNER JOIN
interpretations USING (interpretation_id)
INNER JOIN
scans
ON scans.scan_id = interpretations.scan_id
AND scans.archive_folder = interpretations.archive_folder
INNER JOIN
archives
ON scans.archive_folder = archives.archive_folder
WHERE
archives.user_id = "google..."
AND tags.tag = "tag1"
)
as a
INNER JOIN
interpretations
ON a.scan_id = interpretations.scan_id
AND a.archive_folder = interpretations.archive_folder
INNER JOIN
interpretationtags USING(interpretation_id)
INNER JOIN
tags USING(tagid)
WHERE
tags.tag = "tag2"
)
as b
INNER JOIN
interpretations
ON b.scan_id = interpretations.scan_id
AND b.archive_folder = interpretations.archive_folder
INNER JOIN
interpretationtags USING(interpretation_id)
INNER JOIN
tags USING(tagid)
WHERE
tags.tag = "tag3"
Even 4 nested subqueries run fast with minimal data, but I just don't see this being a scalable solution when I'm dealing with 100k rows of data. How can I accomplish this without reverting to this ugly, inefficient code?
It's hard to be certain without table structures and sample data, but I think you're going about this in the wrong direction. You should start from scans and find all the appropriate tags, and then filter on those (which should then be a simple IN expression):
SELECT
scans.scan_index,
scans.scan_id,
scans.archive_folder
FROM
scans
INNER JOIN
archives
ON scans.archive_folder = archives.archive_folder
INNER JOIN
interpretations
ON scans.scan_id = interpretations.scan_id
AND scans.archive_folder = interpretations.archive_folder
INNER JOIN
interpretationtags USING (interpretation_id)
INNER JOIN
tags USING (tagid)
WHERE
archives.user_id = "google-authd...."
AND tags.tag IN("tag1", "tag2")
Note that based on your SELECT field list I don't think you actually need to JOIN to archives at all.
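One caveat about an IN filter: on its own it matches scans that carry any of the listed tags, whereas the question describes an AND-style search. A common way to require every tag while keeping the query flat is to group by scan and count the distinct tags that matched. The following is only a sketch under assumptions, since the table structures were not posted: it assumes the three selected columns can be grouped on together, and the literal 2 in the HAVING clause is simply the number of tags in the IN list.

```sql
-- Sketch: return scans that carry ALL of the listed tags.
-- Assumes the same tables and join keys as the queries above.
SELECT
    scans.scan_index,
    scans.scan_id,
    scans.archive_folder
FROM
    scans
INNER JOIN
    archives
        ON scans.archive_folder = archives.archive_folder
INNER JOIN
    interpretations
        ON scans.scan_id = interpretations.scan_id
        AND scans.archive_folder = interpretations.archive_folder
INNER JOIN
    interpretationtags USING (interpretation_id)
INNER JOIN
    tags USING (tagid)
WHERE
    archives.user_id = "google-authd...."
    AND tags.tag IN ("tag1", "tag2")
-- One group per scan; all selected columns are listed so the query
-- also works when ONLY_FULL_GROUP_BY is enabled.
GROUP BY
    scans.scan_index,
    scans.scan_id,
    scans.archive_folder
-- DISTINCT keeps a tag that appears in several interpretations of the
-- same scan from being counted more than once.
HAVING
    COUNT(DISTINCT tags.tag) = 2
```

With this shape the PHP side only has to build the IN list and pass the number of tags for the HAVING comparison, so the query no longer grows a nesting level per tag. Unlike the nested version, it returns each matching scan once rather than once per matching interpretation.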