SQL: Joining on a group function

My table is:

Table_1

uid, hospital_nr, department_nr, diagnosis_nr, case_amount

My query in words: I want to find out, for each hospital, which department has the most cases of three particular diagnoses.

To get the total number of cases of these diagnoses for each department, I use:

SELECT hospital_nr, department_nr, sum(case_amount) AS cases 
FROM Table_1 
WHERE diagnosis_nr = 1 OR diagnosis_nr = 3 OR diagnosis_nr = 4
GROUP BY hospital_nr, department_nr;

To get, for each hospital, the highest case total that any single department has, I use:

SELECT b.hospital_nr, max(a.sum_of_cases) AS max_sum_of_cases
FROM Table_1 AS b,
(SELECT Hospital_nr, department_nr, sum(case_amount) AS sum_of_cases 
FROM Table_1 
WHERE diagnosis_nr = 1 OR diagnosis_nr = 3 OR diagnosis_nr = 4
GROUP BY hospital_nr, department_nr) AS a
WHERE diagnosis_nr = 1 OR diagnosis_nr = 3 OR diagnosis_nr = 4
AND b.hospital_nr = a.hospital_nr 
GROUP BY b.hospital_nr;

Now I want to combine these two result sets with an INNER JOIN and have tried this:

SELECT c.hospital_nr, c.department_nr, sum(case_amount) AS cases 
FROM Table_1 AS c
INNER JOIN
    (SELECT b.hospital_nr, max(a.sum_of_cases) AS max_sum_of_cases
    FROM Table_1 AS b,
        (SELECT hospital_nr, department_nr, sum(case_amount) AS sum_of_cases
        FROM Table_1 
        WHERE diagnosis_nr = 1 OR diagnosis_nr = 3 OR diagnosis_nr = 4
    GROUP BY hospital_nr, department_nr) AS a
    WHERE b.diagnosis_nr = 1 OR b.diagnosis_nr = 3 OR b.diagnosis_nr = 4
    AND b.hospital_nr = a.hospital_nr 
    GROUP BY b.hospital_nr) AS b 
ON c.cases = b.max_sum_of_cases
WHERE c.diagnosis_nr = 1 OR c.diagnosis_nr = 3 OR c.diagnosis_nr = 4
GROUP BY c.hospital_nr;

The database rejects this join: it reports that it does not recognize "cases" as a column in the ON clause. Why is that, and how can I fix it? My first attempt at answering the question above used a HAVING clause, but that also failed, because it did not let me filter for the department with the maximum number of cases. Did I overlook something in that approach?

I would suggest using a substring_index() / group_concat() trick:

SELECT hospital_nr,
       SUBSTRING_INDEX(GROUP_CONCAT(department_nr ORDER BY cases DESC), ',', 1) as max_department_nr
FROM (SELECT hospital_nr, department_nr, sum(case_amount) AS cases 
      FROM Table_1 
      WHERE diagnosis_nr in (1, 3, 4)
      GROUP BY hospital_nr, department_nr
     ) hd
GROUP BY hospital_nr;

There are other approaches, but this method is generally the simplest in MySQL.

Note: This assumes that department_nr does not contain commas.
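
As one example of the other approaches (this is not the method above): assuming MySQL 8.0 or later, ROW_NUMBER() can rank the departments within each hospital. The sketch below runs the query through Python's sqlite3 module as a stand-in for MySQL, on the assumption that your SQLite build (3.25 or later) accepts the same window-function syntax. The sample rows and the names hd, ranked and rn are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table_1 (uid INTEGER PRIMARY KEY, hospital_nr INTEGER, "
    "department_nr INTEGER, diagnosis_nr INTEGER, case_amount INTEGER)"
)
# Hypothetical sample rows: (uid, hospital_nr, department_nr, diagnosis_nr, case_amount)
rows = [
    (1, 1, 10, 1, 5),
    (2, 1, 10, 3, 7),
    (3, 1, 20, 4, 9),
    (4, 1, 20, 2, 50),  # diagnosis 2 is not one of the three, so it is ignored
    (5, 2, 10, 1, 2),
    (6, 2, 30, 3, 4),
    (7, 2, 30, 4, 6),
]
conn.executemany("INSERT INTO Table_1 VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT hospital_nr, department_nr, cases
FROM (SELECT hospital_nr, department_nr, cases,
             ROW_NUMBER() OVER (PARTITION BY hospital_nr
                                ORDER BY cases DESC, department_nr) AS rn
      FROM (SELECT hospital_nr, department_nr, SUM(case_amount) AS cases
            FROM Table_1
            WHERE diagnosis_nr IN (1, 3, 4)
            GROUP BY hospital_nr, department_nr) AS hd) AS ranked
WHERE rn = 1
ORDER BY hospital_nr
"""
# One row per hospital: the top-ranked department and its case total.
result = conn.execute(query).fetchall()
print(result)
```

Unlike the GROUP_CONCAT() trick, this keeps department_nr in its original type and returns the case total as well.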

I think the problem is that cases is only an alias in the outer SELECT list, so it does not exist as c.cases when the ON clause is evaluated. If you compute the sums in a derived table aliased c, the join can reference c.cases. Some other aliases were missing as well, so I have qualified the columns explicitly. The join also has to match on hospital_nr, otherwise a department could match another hospital's maximum.

SELECT c.hospital_nr, c.department_nr, c.cases
FROM (SELECT hospital_nr, department_nr, sum(case_amount) AS cases
      FROM Table_1
      WHERE diagnosis_nr in (1, 3, 4)
      GROUP BY hospital_nr, department_nr) AS c
INNER JOIN
     (SELECT a.hospital_nr, max(a.sum_of_cases) AS max_sum_of_cases
      FROM (SELECT hospital_nr, department_nr, sum(case_amount) AS sum_of_cases
            FROM Table_1
            WHERE diagnosis_nr in (1, 3, 4)
            GROUP BY hospital_nr, department_nr) AS a
      GROUP BY a.hospital_nr) AS b
ON c.hospital_nr = b.hospital_nr
AND c.cases = b.max_sum_of_cases;
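
To check the idea of joining per-department sums to per-hospital maxima on concrete data, here is a small sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the sample rows are invented. The query matches on both hospital_nr and the case total, so a department cannot match another hospital's maximum.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table_1 (uid INTEGER PRIMARY KEY, hospital_nr INTEGER, "
    "department_nr INTEGER, diagnosis_nr INTEGER, case_amount INTEGER)"
)
# Hypothetical sample rows: (uid, hospital_nr, department_nr, diagnosis_nr, case_amount)
rows = [
    (1, 1, 10, 1, 5),
    (2, 1, 10, 3, 7),   # hospital 1, department 10: 5 + 7 = 12
    (3, 1, 20, 4, 9),   # hospital 1, department 20: 9
    (4, 1, 20, 2, 50),  # diagnosis 2 is filtered out
    (5, 2, 10, 1, 2),   # hospital 2, department 10: 2
    (6, 2, 30, 3, 4),
    (7, 2, 30, 4, 6),   # hospital 2, department 30: 4 + 6 = 10
]
conn.executemany("INSERT INTO Table_1 VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT c.hospital_nr, c.department_nr, c.cases
FROM (SELECT hospital_nr, department_nr, SUM(case_amount) AS cases
      FROM Table_1
      WHERE diagnosis_nr IN (1, 3, 4)
      GROUP BY hospital_nr, department_nr) AS c
INNER JOIN
     (SELECT hospital_nr, MAX(sum_of_cases) AS max_sum_of_cases
      FROM (SELECT hospital_nr, department_nr, SUM(case_amount) AS sum_of_cases
            FROM Table_1
            WHERE diagnosis_nr IN (1, 3, 4)
            GROUP BY hospital_nr, department_nr) AS a
      GROUP BY hospital_nr) AS b
  ON c.hospital_nr = b.hospital_nr
 AND c.cases = b.max_sum_of_cases
ORDER BY c.hospital_nr
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 10, 12), (2, 30, 10)]
```

If two departments in a hospital share the maximum, this join returns both of them.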

You can use an ordered correlated subquery with LIMIT 1 in the WHERE clause to pick the department_nr with the highest sum of case_amount. The subquery has to group by department_nr so that the sums are computed per department:

SELECT DISTINCT hospital_nr, department_nr
FROM Table_1 t1
WHERE department_nr = (
    SELECT department_nr 
    FROM Table_1 t2
    WHERE t2.hospital_nr = t1.hospital_nr
      AND t2.diagnosis_nr IN (1, 3, 4)
    GROUP BY department_nr
    ORDER BY sum(case_amount) DESC
    LIMIT 1
)

If you also need the sum, you will need to calculate it once more, repeating the diagnosis filter in the outer query so that only the three diagnoses are counted:

SELECT hospital_nr, department_nr, sum(case_amount) AS cases
FROM Table_1 t1
WHERE diagnosis_nr IN (1, 3, 4)
  AND department_nr = (
    SELECT department_nr 
    FROM Table_1 t2
    WHERE t2.hospital_nr = t1.hospital_nr
      AND t2.diagnosis_nr IN (1, 3, 4)
    GROUP BY department_nr
    ORDER BY sum(case_amount) DESC
    LIMIT 1
)
GROUP BY hospital_nr, department_nr

Note: If two departments have the same sum, the query will "pick" only one of them. If you want to control which one is picked in that case, add a tie-breaking column (e.g. department_nr) to the ORDER BY clause.
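
A minimal sketch of the tie-breaking idea. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the sample rows are invented. In hospital 1, departments 10 and 20 both have 8 cases, and the extra ORDER BY column makes the lower department_nr win every time.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table_1 (uid INTEGER PRIMARY KEY, hospital_nr INTEGER, "
    "department_nr INTEGER, diagnosis_nr INTEGER, case_amount INTEGER)"
)
# Hypothetical sample rows: (uid, hospital_nr, department_nr, diagnosis_nr, case_amount)
rows = [
    (1, 1, 20, 1, 8),
    (2, 1, 10, 3, 8),  # ties with department 20 in hospital 1
    (3, 1, 30, 4, 3),
    (4, 2, 10, 1, 1),
    (5, 2, 20, 4, 6),
]
conn.executemany("INSERT INTO Table_1 VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT hospital_nr, department_nr, SUM(case_amount) AS cases
FROM Table_1 AS t1
WHERE diagnosis_nr IN (1, 3, 4)
  AND department_nr = (
      SELECT t2.department_nr
      FROM Table_1 AS t2
      WHERE t2.hospital_nr = t1.hospital_nr
        AND t2.diagnosis_nr IN (1, 3, 4)
      GROUP BY t2.department_nr
      -- highest sum first; on a tie, the lowest department_nr wins
      ORDER BY SUM(t2.case_amount) DESC, t2.department_nr ASC
      LIMIT 1)
GROUP BY hospital_nr, department_nr
ORDER BY hospital_nr
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 10, 8), (2, 20, 6)]
```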

