SQL logic: two tables (GROUP BY?)

Question

I have two tables: a table of movies (mt, the primary table) and a table listing where those movies are available (st, the secondary table). For each mt record there are multiple st records. I need a query that joins the records and lets me run queries on fields from both tables. I currently use an inner join like this:

SELECT * FROM
(
    SELECT ROW_NUMBER() OVER(ORDER BY " + orderField + @") AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title,
           [...]
           st.ID AS st_ID,
           st.title AS st_title,
           [...]
    FROM mt AS mt 
    st AS st
    INNER JOIN sttable AS st on mt.ID =st.ID
    WHERE st.title=@variable <> 0 AND mt.title = @variable
)    
    as DerivedTableName
    WHERE RowNum between 
    ((@pageIndex - 1) * @pageSize + 1) and @pageIndex*@pageSize

The problem with this is that I need to be able to page through mt using RowNum and pageIndex, but the query returns more than one record for each movie (for example, if there are 8 st records for a particular movie, 8 rows are returned instead of 1). I have tried using GROUP BY, but the problem with that is that it will not allow me to perform queries on fields in the subordinate table (st).

Any help with the appropriate logic would be very much appreciated.

Answer 1

Here's a solution that does a GROUP BY in the derived table subquery so you get only one row per movie, and calculates the ROW_NUMBER() from that.

Then join the result of the derived table to st again in the outer query. There will still be multiple rows per movie, but the RowNum will repeat so you can filter for your @pageIndex correctly.

SELECT * FROM
(
    SELECT ROW_NUMBER() OVER(ORDER BY " + orderField + @") AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title,
           [...]
    FROM mt AS mt 
    INNER JOIN sttable AS st ON mt.ID =st.ID
    WHERE mt.title = @variable
    GROUP BY mt.ID, mt.title, [...]
) mt1
INNER JOIN sttable AS st1 ON (mt1.mt_ID = st1.ID)
WHERE mt1.RowNum BETWEEN 
    ((@pageIndex - 1) * @pageSize + 1) AND @pageIndex*@pageSize;

You'll have to loop over the result of the outer query in your display code and begin a new row of output whenever the value of RowNum changes. This is a pretty obvious technique.
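
As a rough illustration of that loop, here is a minimal sketch in Python, with SQLite (3.25 or later, for window functions) standing in for SQL Server. The schema, the sample rows, the location column, and the fetch_page helper are all hypothetical, and the sort column is fixed to mt.title in place of orderField. Only the group-then-rejoin shape of the query follows the answer above.

```python
import sqlite3
from itertools import groupby

# Hypothetical schema and data; sttable.ID holds the movie's ID,
# mirroring the mt.ID = st.ID join used in the query above.
SCHEMA = """
CREATE TABLE mt (ID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE sttable (ID INTEGER, location TEXT);
INSERT INTO mt VALUES (1, 'Alien'), (2, 'Brazil'), (3, 'Casablanca');
INSERT INTO sttable VALUES
    (1, 'Store A'), (1, 'Store B'),
    (2, 'Store A'),
    (3, 'Store B'), (3, 'Store C');
"""

# One row per movie in the derived table, numbered after grouping,
# then joined back to sttable so every location row carries its RowNum.
PAGED_QUERY = """
SELECT mt1.RowNum, mt1.mt_title, st1.location
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY mt.title) AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title
    FROM mt
    INNER JOIN sttable AS st ON mt.ID = st.ID
    GROUP BY mt.ID, mt.title
) AS mt1
INNER JOIN sttable AS st1 ON mt1.mt_ID = st1.ID
WHERE mt1.RowNum BETWEEN (:page_index - 1) * :page_size + 1
                     AND :page_index * :page_size
ORDER BY mt1.RowNum, st1.location
"""


def fetch_page(conn, page_index, page_size):
    """Return one (title, [locations]) entry per movie on the requested page."""
    rows = conn.execute(
        PAGED_QUERY, {"page_index": page_index, "page_size": page_size}
    ).fetchall()
    page = []
    # A change in RowNum marks the start of the next movie.
    for (row_num, title), group in groupby(rows, key=lambda r: (r[0], r[1])):
        page.append((title, [location for _, _, location in group]))
    return page


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for title, locations in fetch_page(conn, 1, 2):
    print(title, locations)
```

The outer ORDER BY mt1.RowNum matters for this sketch: groupby only merges adjacent rows, so the rows of one movie have to arrive together.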

If you instead want to do something like MySQL's GROUP_CONCAT() function, this is tricky in Microsoft SQL Server (I assume you're using Microsoft).

See blogs like http://blog.shlomoid.com/2008/11/emulating-mysqls-groupconcat-function.html that describe use of the FOR XML PATH('') trick.
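
To show the kind of result being described (one row per movie, with the st values collapsed into a single string), here is a small sketch using SQLite's built-in group_concat() through Python's sqlite3 module. It is not the SQL Server technique from the linked post. The schema, the location column, and the movie_locations helper are hypothetical.

```python
import sqlite3

# Hypothetical schema and data; sttable.ID holds the movie's ID.
SCHEMA = """
CREATE TABLE mt (ID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE sttable (ID INTEGER, location TEXT);
INSERT INTO mt VALUES (1, 'Alien'), (2, 'Brazil');
INSERT INTO sttable VALUES (1, 'Store A'), (1, 'Store B'), (2, 'Store A');
"""


def movie_locations(conn):
    """Return one (title, concatenated locations) row per movie."""
    return conn.execute(
        """
        SELECT mt.title, group_concat(st.location, ', ') AS locations
        FROM mt
        INNER JOIN sttable AS st ON mt.ID = st.ID
        GROUP BY mt.ID, mt.title
        ORDER BY mt.title
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for title, locations in movie_locations(conn):
    print(title, "->", locations)
```

SQLite does not promise any particular order for the concatenated values here, so display code should not rely on it.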

PS: Your sample SQL query didn't make sense given your verbal description, so I did my best to write something sensible. No guarantees it matches your schema.

Re your comment: I don't think you need to order by any column of st inside the subquery. I intended there to be no columns from st included in the select-list of the subquery. The only reason there's an instance of st joined in the subquery is to restrict the rows of mt to those that have matching rows in st. But the GROUP BY mt.ID makes sure there's only one row per row of mt (actually since this is SQL Server and not MySQL you'll need to name all the mt columns of the select-list in the GROUP BY clause).

Re your second comment:

"I want to first display mt rows that have corresponding st records that have been most recently added"

You can add other columns to the grouped query if you use aggregate functions. For instance, the latest date_added per group of mt is MAX(st.date_added), and you can add this column to the subquery.

However, don't use ORDER BY in the subquery. There's seldom any reason to sort a subquery, since the order may be altered anyway by using the subquery result in the JOIN or other operations you'd use a subquery in.

You should sort in the outer query:

SELECT * FROM
(
    SELECT ROW_NUMBER() OVER(ORDER BY " + orderField + @") AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title,
           [...], -- other mt.* columns
           MAX(st.date_added) AS latest_date_added
    FROM mt AS mt 
    INNER JOIN sttable AS st ON mt.ID = st.ID
    WHERE mt.title = @variable
    GROUP BY mt.ID, mt.title -- plus the other mt.* columns
) mt1
INNER JOIN sttable AS st1 ON (mt1.mt_ID = st1.ID)
WHERE mt1.RowNum BETWEEN 
    ((@pageIndex - 1) * @pageSize + 1) AND @pageIndex*@pageSize
ORDER BY mt1.latest_date_added DESC, st1.date_added DESC;
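
For anyone who wants to try this shape of query on sample data, here is a minimal sketch in Python with SQLite (3.25 or later) standing in for SQL Server. The schema, the rows, the location column, and the latest_first_page helper are hypothetical. Dates are stored as ISO text so they sort correctly, and the ROW_NUMBER() is ordered by mt.title in place of orderField.

```python
import sqlite3

# Hypothetical schema and data; sttable.ID holds the movie's ID.
SCHEMA = """
CREATE TABLE mt (ID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE sttable (ID INTEGER, location TEXT, date_added TEXT);
INSERT INTO mt VALUES (1, 'Alien'), (2, 'Brazil'), (3, 'Casablanca');
INSERT INTO sttable VALUES
    (1, 'Store A', '2009-01-10'),
    (1, 'Store B', '2009-03-05'),
    (2, 'Store A', '2009-06-01'),
    (3, 'Store C', '2009-02-20');
"""

# The aggregate MAX(st.date_added) is computed per movie in the subquery;
# the sort happens only in the outer query.
LATEST_QUERY = """
SELECT mt1.mt_title, mt1.latest_date_added, st1.location, st1.date_added
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY mt.title) AS RowNum,
           mt.ID AS mt_ID,
           mt.title AS mt_title,
           MAX(st.date_added) AS latest_date_added
    FROM mt
    INNER JOIN sttable AS st ON mt.ID = st.ID
    GROUP BY mt.ID, mt.title
) AS mt1
INNER JOIN sttable AS st1 ON mt1.mt_ID = st1.ID
WHERE mt1.RowNum BETWEEN (:page_index - 1) * :page_size + 1
                     AND :page_index * :page_size
ORDER BY mt1.latest_date_added DESC, st1.date_added DESC
"""


def latest_first_page(conn, page_index, page_size):
    """Return the page's joined rows, most recently added movies first."""
    return conn.execute(
        LATEST_QUERY, {"page_index": page_index, "page_size": page_size}
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for row in latest_first_page(conn, 1, 2):
    print(row)
```

Note that RowNum is still numbered by title in this sketch, so the date ordering only applies within a page. If the pages themselves should follow recency, one option might be to order the ROW_NUMBER() by MAX(st.date_added) DESC instead.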

Answer 2

I don't think you need to have both this:

FROM mt AS mt 
  st AS st

and this:

INNER JOIN sttable AS st on mt.ID = st.ID


Answer 1: 1, accepted, 2009-11-25 00:28:20

Answer 2: 0, 2009-11-25 00:23:57
