Using GROUP_CONCAT

I have three tables. TB_Main is a table of Entities. TB_BoardMembers is a table of People. TB_BoardMembersLINK is a bridging table which references the other two by ids and also has start and end dates for when a Person was on the board of an Entity. These dates are often incomplete.

I have been asked to export, as part of a report, a CSV with one row per Entity per year, containing a list of that year's board members with their occupations in a single field delimited by newlines.

I don't need bml.Entity in the result but added it to try to debug. I'm getting one row where I expect 85. I tried with and without GROUP BY, and the fact that the result is the same suggests I am misusing GROUP_CONCAT. How should I construct this to get the result they want?

SELECT 
GROUP_CONCAT(
DISTINCT CONCAT(bm.First, ' ', bm.Last, 
IF (bm.Occupation != '', ' - ', ''),
bm.Occupation)  SEPARATOR "\n") as Board,
bml.Entity
FROM  
TB_Main arfe,
TB_BoardMembers  bm,
TB_BoardMembersLINK  bml
WHERE YEAR(bml.start) <= 2011 
AND (YEAR(bml.end) >= 2011 OR bml.end IS NULL)
AND bml.start > 0 
AND bml.Entity = arfe.ID
GROUP BY bml.Entity
ORDER BY Board

There are a few issues with this query. The main issue appears to be that you are missing a condition to link board members to the link table, so you have a cross join, i.e. you will be returning every board member regardless of their start/end dates. Assuming you have 85 rows in the link table that match the criteria, you will actually be returning each board member 85 times. This highlights a very good reason to switch from the ANSI-89 implicit joins you are using to the ANSI-92 explicit join syntax. This article highlights some very good reasons to make the switch.

So your query would become (I've had to guess at your field names):

SELECT  *
FROM    TB_Main arfe
        INNER JOIN TB_BoardMembersLINK  bml
            ON bml.Entity = arfe.ID
        INNER JOIN TB_BoardMembers  bm
            ON bm.ID = bml.BoardMemberID
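
To see the effect of the missing condition on a small scale, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the sample rows and the BoardMemberID column are guesses, as above. It counts the rows produced by the implicit-join version and by the explicit-join version:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TB_Main (ID INTEGER PRIMARY KEY);
    CREATE TABLE TB_BoardMembers (ID INTEGER PRIMARY KEY, Last TEXT);
    CREATE TABLE TB_BoardMembersLINK (Entity INTEGER, BoardMemberID INTEGER);
    INSERT INTO TB_Main VALUES (1), (2);
    INSERT INTO TB_BoardMembers VALUES (10, 'Adams'), (11, 'Baker'), (12, 'Clark');
    -- Adams and Baker sit on entity 1, Clark on entity 2.
    INSERT INTO TB_BoardMembersLINK VALUES (1, 10), (1, 11), (2, 12);
""")

# Implicit joins with nothing tying bm to bml:
# every board member is paired with every link row.
implicit_rows = conn.execute("""
    SELECT COUNT(*)
    FROM TB_Main arfe, TB_BoardMembers bm, TB_BoardMembersLINK bml
    WHERE bml.Entity = arfe.ID
""").fetchone()[0]

# Explicit joins: each link row matches exactly one board member.
explicit_rows = conn.execute("""
    SELECT COUNT(*)
    FROM TB_Main arfe
    INNER JOIN TB_BoardMembersLINK bml ON bml.Entity = arfe.ID
    INNER JOIN TB_BoardMembers bm ON bm.ID = bml.BoardMemberID
""").fetchone()[0]

print(implicit_rows, explicit_rows)  # 9 3
```

With three link rows and three members, the implicit version yields 3 x 3 = 9 rows, while the explicit version yields the 3 rows that actually exist.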

The next thing I noticed about your query is that wrapping columns in functions in the WHERE clause is inefficient. With this:

WHERE   YEAR(bml.start) <= 2011 
AND     (YEAR(bml.end) >= 2011 OR bml.end IS NULL)

You are evaluating the YEAR function twice for every row, and removing any chance of using an index on bml.Start or bml.End (if any exist). Yet again Aaron Bertrand has written a nice article highlighting good practices when querying date ranges. It is targeted at SQL Server, but the principles are the same, so your WHERE clause would become:

WHERE   bml.Start < '20120101'   -- same rows as YEAR(bml.start) <= 2011
AND     (bml.End >= '20110101' OR bml.End IS NULL)
AND     bml.start > 0 
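
A quick way to sanity-check a range rewrite is to run it next to the function-based filter on a handful of rows. The sketch below does that with Python's sqlite3 module as a stand-in for MySQL (an assumption; SQLite has no YEAR(), so strftime('%Y', ...) plays that role). The StartDate and EndDate column names and the sample rows are hypothetical, and the dates are stored as ISO-8601 text. Note that YEAR(start) <= 2011 corresponds to the range bound start < '2012-01-01'.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TB_BoardMembersLINK (BoardMemberID INTEGER, StartDate TEXT, EndDate TEXT);
    INSERT INTO TB_BoardMembersLINK VALUES
        (1, '2009-03-01', '2010-12-31'),  -- ended before 2011
        (2, '2010-05-15', NULL),          -- still serving
        (3, '2011-06-01', '2011-09-30'),  -- served only within 2011
        (4, '2012-01-01', NULL);          -- started after 2011
""")

# Function-based filter: the year is extracted from every row before comparing.
by_function = [r[0] for r in conn.execute("""
    SELECT BoardMemberID FROM TB_BoardMembersLINK
    WHERE CAST(strftime('%Y', StartDate) AS INTEGER) <= 2011
      AND (CAST(strftime('%Y', EndDate) AS INTEGER) >= 2011 OR EndDate IS NULL)
    ORDER BY BoardMemberID
""")]

# Range-based filter: the bare columns are compared against constants,
# which leaves an index on those columns usable.
by_range = [r[0] for r in conn.execute("""
    SELECT BoardMemberID FROM TB_BoardMembersLINK
    WHERE StartDate < '2012-01-01'
      AND (EndDate >= '2011-01-01' OR EndDate IS NULL)
    ORDER BY BoardMemberID
""")]

print(by_function, by_range)  # [2, 3] [2, 3]
```

Both filters keep members 2 and 3, the only two who served at some point during 2011.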

Your final query should then be:

SELECT  bml.Entity,
        GROUP_CONCAT(DISTINCT CONCAT(bm.First, ' ', bm.Last, 
            IF (bm.Occupation != '', ' - ', ''), bm.Occupation) 
            SEPARATOR "\n") as Board
FROM    TB_Main arfe
        INNER JOIN TB_BoardMembersLINK  bml
            ON bml.Entity = arfe.ID
        INNER JOIN TB_BoardMembers  bm
            ON bm.ID = bml.BoardMemberID
WHERE   bml.Start < '20120101'   -- same rows as YEAR(bml.start) <= 2011
AND     (bml.End >= '20110101' OR bml.End IS NULL)
AND     bml.start > 0
GROUP BY bml.Entity
ORDER BY Board;
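
For anyone who wants to try the shape of this query without a MySQL server, here is a rough equivalent using Python's sqlite3 module. Several things are assumptions made for the sketch: the sample data, the StartDate/EndDate column names, and the SQLite spellings of the MySQL pieces (|| instead of CONCAT, CASE instead of IF, and group_concat(expr, separator) instead of the SEPARATOR keyword). DISTINCT is left out because older SQLite versions reject it together with a custom separator.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TB_Main (ID INTEGER PRIMARY KEY);
    CREATE TABLE TB_BoardMembers (ID INTEGER PRIMARY KEY, First TEXT, Last TEXT, Occupation TEXT);
    CREATE TABLE TB_BoardMembersLINK (Entity INTEGER, BoardMemberID INTEGER,
                                      StartDate TEXT, EndDate TEXT);
    INSERT INTO TB_Main VALUES (1), (2);
    INSERT INTO TB_BoardMembers VALUES
        (10, 'Ann', 'Adams', 'Lawyer'),
        (11, 'Bob', 'Baker', ''),
        (12, 'Cy', 'Clark', 'Engineer');
    INSERT INTO TB_BoardMembersLINK VALUES
        (1, 10, '2010-01-01', NULL),
        (1, 11, '2011-03-01', '2011-12-31'),
        (2, 12, '2009-01-01', '2011-06-30'),
        (2, 10, '2005-01-01', '2008-12-31');  -- ended before 2011, filtered out
""")

rows = conn.execute("""
    SELECT bml.Entity,
           group_concat(
               bm.First || ' ' || bm.Last ||
               CASE WHEN bm.Occupation != '' THEN ' - ' ELSE '' END ||
               bm.Occupation,
               char(10)) AS Board          -- char(10) is a newline
    FROM TB_Main arfe
    INNER JOIN TB_BoardMembersLINK bml ON bml.Entity = arfe.ID
    INNER JOIN TB_BoardMembers bm ON bm.ID = bml.BoardMemberID
    WHERE bml.StartDate < '2012-01-01'
      AND (bml.EndDate >= '2011-01-01' OR bml.EndDate IS NULL)
    GROUP BY bml.Entity
    ORDER BY bml.Entity
""").fetchall()

# SQLite does not guarantee the order inside group_concat, so sort before printing.
boards = {entity: sorted(board.split("\n")) for entity, board in rows}
print(boards)
```

The result has one row per entity, each holding that entity's 2011 board members in a single newline-delimited field.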

Example on SQL Fiddle

If you read up on GROUP_CONCAT:

"This function returns a string result with the concatenated non-NULL values from a group."

In this case there seems to be just one group. Do you have only one entity? I can't tell from your description. Why don't you also group by first name, last name and occupation? That may give you all the members.

I am also not sure about your joins. Without real data it's tough to comment on that part, since any join will work properly for some set of data, even if it's not the best way to write the query.
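
The quoted behaviour can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption; SQLite's group_concat takes the separator as a second argument) and a hypothetical table named seats. Without GROUP BY the whole result set is a single group, so the aggregate returns exactly one row. With GROUP BY there is one row per group, and NULL values are skipped either way.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE seats (Entity INTEGER, Member TEXT);   -- hypothetical table
    INSERT INTO seats VALUES (1, 'Adams'), (1, NULL), (2, 'Clark');
""")

# No GROUP BY: all rows form one group, so exactly one row comes back.
ungrouped = conn.execute(
    "SELECT group_concat(Member, ';') FROM seats"
).fetchall()

# GROUP BY Entity: one row per entity; the NULL member is ignored.
grouped = conn.execute(
    "SELECT Entity, group_concat(Member, ';') FROM seats "
    "GROUP BY Entity ORDER BY Entity"
).fetchall()

print(len(ungrouped), grouped)  # 1 [(1, 'Adams'), (2, 'Clark')]
```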
