MySQL Query GROUP_CONCAT Over Multiple Rows

I'm getting name and address data out of generic question / answer data to create some kind of normalised reporting database.

The query I've got uses group_concat and works for individual sets of questions but not for multiple sets.

I've tried to simplify what I'm doing by using just forename and surname and just 3 records, 2 for 1 person and 1 for another. In reality though there are more than 300,000 records.

Example of results with qs.Id = 1:

QuestionSetId       Forename        Surname
-------------------------------------------------------
            1       Bob             Jones

Example of results with qs.Id IN (1, 2, 3):

QuestionSetId       Forename        Surname
-------------------------------------------------------
            3       Bob,Bob,Frank   Jones,Jones,Smith

What I would like to see for qs.Id IN (1, 2, 3):

QuestionSetId       Forename        Surname
-------------------------------------------------------
            1       Bob             Jones
            2       Bob             Jones
            3       Frank           Smith

So how can I make the 2nd example return a separate row for each set of name and address information?

I realise the current way the data is stored is "questionable" but I cannot change the way the data is stored.

I can get a single answer per question set (query 3 below), but I'm not sure how to combine several answers into one row per set.

My simplified Schema that I cannot change:

CREATE TABLE StaticQuestion (
  Id INT NOT NULL,
  StaticText VARCHAR(500) NOT NULL);

CREATE TABLE Question (
  Id INT NOT NULL, 
  Text VARCHAR(500) NOT NULL);

CREATE TABLE StaticQuestionQuestionLink (
  Id INT NOT NULL,
  StaticQuestionId INT NOT NULL,
  QuestionId INT NOT NULL,
  DateEffective DATETIME NOT NULL);

CREATE TABLE Answer (
  Id INT NOT NULL,
  Text VARCHAR(500) NOT NULL);

CREATE TABLE QuestionSet (
  Id INT NOT NULL,
  DateEffective DATETIME NOT NULL);

CREATE TABLE QuestionAnswerLink (
  Id INT NOT NULL,
  QuestionSetId INT NOT NULL,
  QuestionId INT NOT NULL,
  AnswerId INT NOT NULL,
  StaticQuestionId INT NOT NULL);

Some example data for only forename and surname.

INSERT INTO StaticQuestion (Id, StaticText)
VALUES (1, 'FirstName'),
       (2, 'LastName');

INSERT INTO Question (Id, Text)
VALUES (1, 'What is your first name?'),
       (2, 'What is your forename?'),
       (3, 'What is your Surname?');

INSERT INTO StaticQuestionQuestionLink (Id, StaticQuestionId, QuestionId, DateEffective)
VALUES (1, 1, 1, '2001-01-01'),
       (2, 1, 2, '2008-08-08'),
       (3, 2, 3, '2001-01-01');

INSERT INTO Answer (Id, Text)
VALUES (1, 'Bob'),
       (2, 'Jones'),
       (3, 'Bob'),
       (4, 'Jones'),
       (5, 'Frank'),
       (6, 'Smith');

INSERT INTO QuestionSet (Id, DateEffective)
VALUES (1, '2002-03-25'),
       (2, '2009-05-05'),
       (3, '2009-08-06');

INSERT INTO QuestionAnswerLink (Id, QuestionSetId, QuestionId, AnswerId, StaticQuestionId)
VALUES (1, 1, 1, 1, 1),
       (2, 1, 3, 2, 2),
       (3, 2, 2, 3, 1),
       (4, 2, 3, 4, 2),
       (5, 3, 2, 5, 1),
       (6, 3, 3, 6, 2);

Just in case SQLFiddle is down, here are the 3 queries from the examples I've linked to:

1: Working query, but only for 1 set of data.

SELECT MAX(QuestionSetId) AS QuestionSetId,
       GROUP_CONCAT(Forename) AS Forename,
       GROUP_CONCAT(Surname) AS Surname
FROM   (SELECT
            x.QuestionSetId,
            CASE x.StaticQuestionId WHEN 1 THEN Text END AS Forename,
            CASE x.StaticQuestionId WHEN 2 THEN Text END AS Surname        
        FROM     (SELECT (SELECT   link.StaticQuestionId
                          FROM     StaticQuestionQuestionLink link
                          WHERE    link.QuestionId = qa.QuestionId
                               AND link.DateEffective <= qs.DateEffective
                               AND link.StaticQuestionId IN (1, 2)
                          ORDER BY link.DateEffective DESC LIMIT 1) AS StaticQuestionId,
                         a.Text,
                         qa.QuestionSetId
                  FROM   QuestionSet qs
                            INNER JOIN QuestionAnswerLink qa ON qs.Id = qa.QuestionSetId
                            INNER JOIN Answer a ON qa.AnswerId = a.Id
                  WHERE  qs.Id IN (1)) x) y

2: Working query, but with undesired results for multiple sets of data.

SELECT MAX(QuestionSetId) AS QuestionSetId,
       GROUP_CONCAT(Forename) AS Forename,
       GROUP_CONCAT(Surname) AS Surname
FROM   (SELECT
            x.QuestionSetId,
            CASE x.StaticQuestionId WHEN 1 THEN Text END AS Forename,
            CASE x.StaticQuestionId WHEN 2 THEN Text END AS Surname        
        FROM     (SELECT (SELECT   link.StaticQuestionId
                          FROM     StaticQuestionQuestionLink link
                          WHERE    link.QuestionId = qa.QuestionId
                               AND link.DateEffective <= qs.DateEffective
                               AND link.StaticQuestionId IN (1, 2)
                          ORDER BY link.DateEffective DESC LIMIT 1) AS StaticQuestionId,
                         a.Text,
                         qa.QuestionSetId
                  FROM   QuestionSet qs
                            INNER JOIN QuestionAnswerLink qa ON qs.Id = qa.QuestionSetId
                            INNER JOIN Answer a ON qa.AnswerId = a.Id
                  WHERE  qs.Id IN (1, 2, 3)) x) y

3: Working query for multiple sets of data, but only for 1 field (answer).

SELECT
    qs.Id AS QuestionSet,
    a.Text AS Answer
FROM
    QuestionSet qs
        INNER JOIN QuestionAnswerLink qalink ON qs.Id = qalink.QuestionSetId
        INNER JOIN StaticQuestionQuestionLink sqqlink ON qalink.QuestionId = sqqlink.QuestionId
        INNER JOIN Answer a ON qalink.AnswerId = a.Id
WHERE
        sqqlink.StaticQuestionId = 1 /* FirstName */
    AND sqqlink.DateEffective = 
            (SELECT DateEffective
               FROM StaticQuestionQuestionLink
              WHERE StaticQuestionId = 1 
                AND DateEffective <= qs.DateEffective
           ORDER BY DateEffective
               DESC 
              LIMIT 1)

@PeteGO, try this:

SELECT
    qs.Id AS QuestionSet,
    GROUP_CONCAT(a.Text SEPARATOR ', ')  AS Answer
FROM
    QuestionSet qs
        INNER JOIN QuestionAnswerLink qalink ON qs.Id = qalink.QuestionSetId
        INNER JOIN StaticQuestionQuestionLink sqqlink ON qalink.QuestionId = sqqlink.QuestionId
        INNER JOIN Answer a ON qalink.AnswerId = a.Id
WHERE
        sqqlink.StaticQuestionId IN (1, 2) /* FirstName, LastName */
GROUP BY qs.Id;

OR

SELECT
    qs.Id AS QuestionSet,
    GROUP_CONCAT(a.Text ORDER BY a.Id) AS Answer
FROM
    QuestionSet qs
        INNER JOIN QuestionAnswerLink qalink ON qs.Id = qalink.QuestionSetId
        INNER JOIN StaticQuestionQuestionLink sqqlink ON qalink.QuestionId = sqqlink.QuestionId
        INNER JOIN Answer a ON qalink.AnswerId = a.Id
WHERE
        sqqlink.StaticQuestionId IN (1, 2)
GROUP BY qs.Id;

OR

SELECT qs.Id AS QuestionSet,
       GROUP_CONCAT(a.Text ORDER BY a.Id) AS Answer
FROM QuestionAnswerLink qalink
    LEFT JOIN QuestionSet qs ON qalink.QuestionSetId = qs.Id
    LEFT JOIN Answer a ON qalink.AnswerId = a.Id
GROUP BY qs.Id;
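
The variants above return both answers in a single Answer column per question set. Below is a minimal sketch of splitting them into the Forename and Surname columns the question asks for. It assumes the StaticQuestionId column already stored on QuestionAnswerLink is reliable, which the question does not confirm (the original query derives it from StaticQuestionQuestionLink by date instead). It also uses MAX over a CASE rather than GROUP_CONCAT, which assumes each set holds at most one answer per field.

```sql
-- Sketch only: trusts QuestionAnswerLink.StaticQuestionId (an assumption)
-- and expects at most one answer per field in each question set.
SELECT qalink.QuestionSetId,
       MAX(CASE qalink.StaticQuestionId WHEN 1 THEN a.Text END) AS Forename,
       MAX(CASE qalink.StaticQuestionId WHEN 2 THEN a.Text END) AS Surname
FROM QuestionAnswerLink qalink
    INNER JOIN Answer a ON qalink.AnswerId = a.Id
WHERE qalink.QuestionSetId IN (1, 2, 3)
GROUP BY qalink.QuestionSetId   -- one output row per question set
ORDER BY qalink.QuestionSetId;
```

If the stored StaticQuestionId cannot be trusted, the lookup through StaticQuestionQuestionLink is still needed.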

Stealing from both @PixelMaker and @PeteGO, I'll throw this one in:

SELECT qs.Id AS QuestionSetId,
       GROUP_CONCAT(a.Text order by a.id) AS Answer
FROM QuestionSet qs
JOIN QuestionAnswerLink qa ON qs.Id = qa.QuestionSetId
JOIN StaticQuestionQuestionLink link ON qa.QuestionId = link.QuestionId
JOIN Answer a ON qa.AnswerId = a.Id
WHERE link.DateEffective <= qs.DateEffective
      AND link.StaticQuestionId IN (1, 2)
      AND qs.Id IN (1, 2, 3)
GROUP BY qs.Id

And finally this one:

SELECT qs.Id AS QuestionSetId,
       GROUP_CONCAT(case link.staticquestionid when 1 then a.Text end) AS forename,
       GROUP_CONCAT(case link.staticquestionid when 2 then a.Text end) AS surname
FROM QuestionSet qs
JOIN QuestionAnswerLink qa ON qs.Id = qa.QuestionSetId
JOIN StaticQuestionQuestionLink link ON qa.QuestionId = link.QuestionId
JOIN Answer a ON qa.AnswerId = a.Id
WHERE link.DateEffective <= qs.DateEffective
      AND link.StaticQuestionId IN (1, 2)
      AND qs.Id IN (1, 2, 3)
GROUP BY qs.Id

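which gives the desired result: GROUP BY qs.Id produces one row per question set, and GROUP_CONCAT skips the NULLs that each CASE yields for the other field.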
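
One caveat with a plain link.DateEffective <= qs.DateEffective filter: it keeps every link effective on or before the set's date, not only the most recent one. That makes no difference for the sample data, where each question has a single link, but a question with several earlier links could contribute the same answer more than once. Below is a sketch that keeps the correlated subquery from the question to pick only the latest link and adds the GROUP BY. Matching on link.QuestionId is an assumption about the intended join column.

```sql
-- Sketch only: resolves each answer's static question from the most recent
-- link on or before the set's date, then pivots per question set.
SELECT x.QuestionSetId,
       GROUP_CONCAT(CASE x.StaticQuestionId WHEN 1 THEN x.Text END) AS Forename,
       GROUP_CONCAT(CASE x.StaticQuestionId WHEN 2 THEN x.Text END) AS Surname
FROM (SELECT qa.QuestionSetId,
             a.Text,
             (SELECT link.StaticQuestionId
              FROM StaticQuestionQuestionLink link
              WHERE link.QuestionId = qa.QuestionId        -- assumed join column
                AND link.DateEffective <= qs.DateEffective
                AND link.StaticQuestionId IN (1, 2)
              ORDER BY link.DateEffective DESC
              LIMIT 1) AS StaticQuestionId
      FROM QuestionSet qs
          INNER JOIN QuestionAnswerLink qa ON qs.Id = qa.QuestionSetId
          INNER JOIN Answer a ON qa.AnswerId = a.Id
      WHERE qs.Id IN (1, 2, 3)) x
GROUP BY x.QuestionSetId        -- without this, all rows collapse into one group
ORDER BY x.QuestionSetId;
```

Against the sample data this should return the same three rows. With more than 300,000 records, the correlated subquery would likely benefit from an index on StaticQuestionQuestionLink (QuestionId, DateEffective), though the schema in the question cannot be changed.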
