
How to write a MySQL query that will limit the results of a joined table or tables and also count the number of items in the joined table or tables?

How would you write a MySQL query that limits the rows returned from a joined table (or a subselect, if that works better) and also counts the number of items in the joined table or tables?

For instance, say you have three tables: projects, tasks, and comments, where a project has zero or more tasks and a task has zero or more comments. How would you limit the number of tasks returned per project to 3 and also return the total number of tasks per project and the number of comments per task?

Here's what I imagine the result set would look like:

project_id, project_title, task_id, task_title, num_tasks, num_comments
------------------------------------------------------------------------
1, Project1, 1, Task1, 4, 3
1, Project1, 2, Task2, 4, 0
1, Project1, 3, Task3, 4, 9
2, Project2, 10, Task10, 20, 0
2, Project2, 11, Task11, 20, 0
2, Project2, 12, Task12, 20, 2
3, Project3, 20, Task20, 17, 5
3, Project3, 21, Task21, 17, 1
3, Project3, 22, Task22, 17, 2

Where 'Project1', 'Project2', etc. just represent a project's title and 'Task1', 'Task2', etc. represent a task's title.

Ultimately (after parsing the results of the query), I'd like to be able to display something like this:

 Project1 (4 tasks)
     Task1 (3 comments)
     Task2 (0 comments)
     Task3 (9 comments)
 Project2 (20 tasks)
     Task10 (0 comments)
     Task11 (0 comments)
     Task12 (2 comments)
 Project3 (17 tasks)
     Task20 (5 comments)
     Task21 (1 comments)
     Task22 (2 comments)

I'm guessing this has to be done with subselects (which is fine). I can't figure out how to accomplish it using joins alone, and I don't have a good enough handle on subselects to write something like this.

Honestly, I'd do this in multiple queries, to avoid the correlated subqueries.

But here you go:

SELECT p.project_id, p.project_title,
    t1.task_id, t1.task_title,
    (SELECT COUNT(*) FROM tasks t 
       WHERE t.project_id = p.project_id) AS num_tasks,
    COALESCE((SELECT COUNT(*) FROM comments c
       WHERE c.task_id = t1.task_id), 0) AS num_comments
FROM projects p
JOIN tasks t1 ON (p.project_id = t1.project_id)
LEFT OUTER JOIN tasks t2 
  ON (p.project_id = t2.project_id AND t1.task_id > t2.task_id)
GROUP BY t1.task_id
HAVING COUNT(*) < 3;

Keep in mind that correlated subqueries like those above (num_tasks and num_comments) must execute many times: once for each row of t1.
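To see how the self-join with HAVING COUNT(*) < 3 picks the first three tasks per project, here is a minimal sketch that runs the query above on a tiny data set. It uses Python's standard-library sqlite3 module as a stand-in for MySQL; the table definitions and sample rows are assumptions made up for illustration, and the ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# Hypothetical schema and sample rows; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (project_id INTEGER PRIMARY KEY, project_title TEXT);
CREATE TABLE tasks (task_id INTEGER PRIMARY KEY, project_id INTEGER, task_title TEXT);
CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, task_id INTEGER, comment TEXT);
INSERT INTO projects VALUES (1, 'Project1'), (2, 'Project2');
INSERT INTO tasks VALUES
    (1, 1, 'Task1'), (2, 1, 'Task2'), (3, 1, 'Task3'), (4, 1, 'Task4'),
    (10, 2, 'Task10');
INSERT INTO comments (task_id, comment) VALUES (1, 'a'), (1, 'b'), (3, 'c');
""")

# Each t1 row is paired with every earlier task (t2) in the same project.
# A task with fewer than three joined rows is among the first three.
rows = conn.execute("""
SELECT p.project_id, p.project_title,
    t1.task_id, t1.task_title,
    (SELECT COUNT(*) FROM tasks t
       WHERE t.project_id = p.project_id) AS num_tasks,
    COALESCE((SELECT COUNT(*) FROM comments c
       WHERE c.task_id = t1.task_id), 0) AS num_comments
FROM projects p
JOIN tasks t1 ON (p.project_id = t1.project_id)
LEFT OUTER JOIN tasks t2
  ON (p.project_id = t2.project_id AND t1.task_id > t2.task_id)
GROUP BY t1.task_id
HAVING COUNT(*) < 3
ORDER BY p.project_id, t1.task_id
""").fetchall()

for row in rows:
    print(row)
```

In this sample, Task4 is joined to three earlier tasks, so it is filtered out, while num_tasks for Project1 still reports 4.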

You can get the same results by running these queries separately and combining the results in your application code:

SELECT p.project_id, p.project_title,
    t1.task_id, t1.task_title
FROM projects p
JOIN tasks t1 ON (p.project_id = t1.project_id)
LEFT OUTER JOIN tasks t2 
  ON (p.project_id = t2.project_id AND t1.task_id > t2.task_id)
GROUP BY t1.task_id
HAVING COUNT(*) < 3;

SELECT task_id, COUNT(*) AS num_comments
FROM comments
WHERE task_id IN (...list of task_id values from first query...)
GROUP BY task_id;

SELECT project_id, COUNT(*) AS num_tasks
FROM tasks
GROUP BY project_id;

Even running three separate queries like this might be faster overall than running the more complex query that gets all the results together. I say "might" because it depends on how much data we're talking about. To be sure, you'd have to test both solutions against your own database.
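The combining step in application code could look roughly like the following. This is a minimal sketch in Python, again using the standard-library sqlite3 module as a stand-in for MySQL; the schema, the sample rows, and the function name load_project_summary are assumptions for illustration, not part of the original answer.

```python
import sqlite3

# Hypothetical schema and sample rows; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (project_id INTEGER PRIMARY KEY, project_title TEXT);
CREATE TABLE tasks (task_id INTEGER PRIMARY KEY, project_id INTEGER, task_title TEXT);
CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, task_id INTEGER, comment TEXT);
INSERT INTO projects VALUES (1, 'Project1'), (2, 'Project2');
INSERT INTO tasks VALUES
    (1, 1, 'Task1'), (2, 1, 'Task2'), (3, 1, 'Task3'), (4, 1, 'Task4'),
    (10, 2, 'Task10');
INSERT INTO comments (task_id, comment) VALUES (1, 'a'), (1, 'b'), (3, 'c');
""")


def load_project_summary(conn):
    """Run the three queries and merge their results by project."""
    # Query 1: up to three tasks per project (lowest task_id first).
    top_tasks = conn.execute("""
        SELECT p.project_id, p.project_title, t1.task_id, t1.task_title
        FROM projects p
        JOIN tasks t1 ON (p.project_id = t1.project_id)
        LEFT OUTER JOIN tasks t2
          ON (p.project_id = t2.project_id AND t1.task_id > t2.task_id)
        GROUP BY t1.task_id
        HAVING COUNT(*) < 3
        ORDER BY p.project_id, t1.task_id
    """).fetchall()

    # Query 2: comment counts, restricted to the task ids from query 1.
    task_ids = [row[2] for row in top_tasks]
    comment_counts = {}
    if task_ids:
        placeholders = ", ".join("?" for _ in task_ids)
        comment_counts = dict(conn.execute(
            "SELECT task_id, COUNT(*) FROM comments "
            f"WHERE task_id IN ({placeholders}) GROUP BY task_id",
            task_ids,
        ))

    # Query 3: total number of tasks per project.
    task_counts = dict(conn.execute(
        "SELECT project_id, COUNT(*) FROM tasks GROUP BY project_id"
    ))

    # Merge: tasks without comments are absent from query 2, so default to 0.
    summary = {}
    for project_id, project_title, task_id, task_title in top_tasks:
        project = summary.setdefault(project_id, {
            "title": project_title,
            "num_tasks": task_counts.get(project_id, 0),
            "tasks": [],
        })
        project["tasks"].append((task_title, comment_counts.get(task_id, 0)))
    return summary


summary = load_project_summary(conn)
for project in summary.values():
    print(f"{project['title']} ({project['num_tasks']} tasks)")
    for task_title, num_comments in project["tasks"]:
        print(f"    {task_title} ({num_comments} comments)")
```

Two things to keep in mind when adapting this: the "?" placeholder is sqlite3's style, and MySQL drivers for Python typically use "%s" instead; and because the first query uses an inner join, projects with no tasks do not appear in the summary at all.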


Re your follow-up question, I'd do this in a subquery:

SELECT p.project_id, p.project_title,
    t1.task_id, t1.task_title
FROM (SELECT * FROM projects ORDER BY last_updated DESC LIMIT 5) p
. . .

Note this is not a correlated subquery; the RDBMS only has to run the subquery once.

I used DESC because I assume you want the most recently updated projects.
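As a rough illustration of how a derived table with LIMIT behaves, the sketch below restricts projects to the most recently updated ones before joining to tasks. It does not reproduce the elided part of the query above; it only shows a plain join. Python's sqlite3 module stands in for MySQL, and the schema, the sample rows, the text-typed last_updated values, and the use of LIMIT 2 instead of 5 (to suit the small sample) are all assumptions.

```python
import sqlite3

# Hypothetical schema and sample rows; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (
    project_id INTEGER PRIMARY KEY,
    project_title TEXT,
    last_updated TEXT  -- ISO dates sort correctly as text
);
CREATE TABLE tasks (task_id INTEGER PRIMARY KEY, project_id INTEGER, task_title TEXT);
INSERT INTO projects VALUES
    (1, 'Project1', '2024-01-01'),
    (2, 'Project2', '2024-03-01'),
    (3, 'Project3', '2024-02-01');
INSERT INTO tasks VALUES
    (1, 1, 'Task1'), (10, 2, 'Task10'), (11, 2, 'Task11'), (20, 3, 'Task20');
""")

# The derived table p is evaluated once, then joined like an ordinary table.
rows = conn.execute("""
SELECT p.project_id, p.project_title, t1.task_id, t1.task_title
FROM (SELECT * FROM projects ORDER BY last_updated DESC LIMIT 2) p
JOIN tasks t1 ON (p.project_id = t1.project_id)
ORDER BY p.last_updated DESC, t1.task_id
""").fetchall()

for row in rows:
    print(row)
```

Project1 has the oldest last_updated value in this sample, so it is dropped by the derived table and none of its tasks appear in the result.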

I would say you'd have to use multiple queries and loops for something like this.
There may be a way to do it in a single query, but it's beyond the time I have :)
Here's some pseudocode to show how I'd accomplish this:

select project_id, project_title from projects
select project_id, count(*) as num_tasks from tasks group by project_id
select task_id, count(*) as num_comments from comments group by task_id

foreach (int projectId in projects.Rows)
{
    select task_id, task_title from tasks where project_id = projectId limit 3
    foreach (int taskId in tasks.Rows)
    {
        select comment_id, comment from comments where task_id = taskId limit 3
    }
}
