How to count distinct values from two columns into one number (follow up)

This is a follow-up question to "How to count distinct values from two columns into one number".

In that question I only asked about the counting part and neglected to mention that I am already joining some other tables into the mix.

The answer given on the previous question is the correct one for that case.

Here's my additional problem now.

I have 3 tables:

Assignments

+----+-------------------+
| id |       name        |
+----+-------------------+
| 1  | first-assignment  |
| 2  | second-assignment |
+----+-------------------+

Submissions

+----+---------------+------------+
| id | assignment_id | student_id |
+----+---------------+------------+
|  1 |             1 |          2 |
|  2 |             2 |          1 |
|  3 |             1 |          3 |
+----+---------------+------------+

Group_submissions

+----+---------------+------------+
| id | submission_id | student_id |
+----+---------------+------------+
| 1  |             1 |          1 |
| 2  |             2 |          2 |
+----+---------------+------------+

Each submission belongs to an assignment.

Submissions can be individual submissions or group submissions.

For an individual submission, the student who submitted to an assignment (assignment_id) is stored in the submissions table (student_id).

For a group submission the same thing happens, with two additional details:

  1. The student who makes the submission goes into the submissions table
  2. The other group members go into the group_submissions table and are associated with the id in the submissions table (so submission_id is a FK to the submissions table)

I want to return every assignment with its columns, plus the number of students who have made submissions to that assignment. Keep in mind that students who did not make the submission themselves (are not in the submissions table) but participated in a group submission (are in the group_submissions table) also count.

Something like this:

+----+-------------------+----------+
| id |       name        | students |
+----+-------------------+----------+
| 1  | first-assignment  |        3 |
| 2  | second-assignment |        2 |
+----+-------------------+----------+

I tried 2 ways of getting the numbers:

count(distinct case
    when group_submissions.student_id is not null then group_submissions.student_id
    when assignment_submissions.student_id is not null then assignment_submissions.student_id
end)

This doesn't work because the CASE expression returns a single value per joined row: once the first condition is met, the second branch is never evaluated. For example, if a joined row has student id 1 from the submissions table and student id 2 from the group_submissions table, only id 2 is returned for that row. Student 1 is then counted only if he/she also appears in some other row, for instance as a group member on a different submission or as the submitter of a submission that has no group members.
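A minimal sketch of that failure, run against the sample rows from the question. It uses Python's sqlite3 module with an in-memory database rather than MySQL, and the table is named submissions instead of assignment_submissions; both are assumptions made only so the example is self-contained.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE submissions (id INTEGER PRIMARY KEY, assignment_id INTEGER, student_id INTEGER);
CREATE TABLE group_submissions (id INTEGER PRIMARY KEY, submission_id INTEGER, student_id INTEGER);
INSERT INTO assignments VALUES (1, 'first-assignment'), (2, 'second-assignment');
INSERT INTO submissions VALUES (1, 1, 2), (2, 2, 1), (3, 1, 3);
INSERT INTO group_submissions VALUES (1, 1, 1), (2, 2, 2);
""")

# The CASE yields one student id per joined row, so the submitter is
# dropped whenever the same row also carries a group member.
case_counts = dict(conn.execute("""
SELECT a.id,
       COUNT(DISTINCT CASE
           WHEN gs.student_id IS NOT NULL THEN gs.student_id
           WHEN s.student_id IS NOT NULL THEN s.student_id
       END)
FROM assignments a
JOIN submissions s ON s.assignment_id = a.id
LEFT JOIN group_submissions gs ON gs.submission_id = s.id
GROUP BY a.id
""").fetchall())

print(case_counts)  # {1: 2, 2: 1}
```

With the sample data the real numbers are 3 and 2, so each assignment loses the submitter of its group submission.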

count(distinct case when group_submissions.student_id is not null then group_submissions.student_id end) 
+ count(distinct case when submissions.student_id is not null then submissions.student_id end)

This one doesn't work because a student who appears in both tables for the same assignment is counted twice.
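A small sketch of the double counting, again using Python's sqlite3 module as a stand-in for MySQL (an assumption). The sample data alone does not trigger the problem, so the third group_submissions row below is hypothetical: it makes student 2, who already submitted for assignment 1, a group member on another submission of the same assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submissions (id INTEGER PRIMARY KEY, assignment_id INTEGER, student_id INTEGER);
CREATE TABLE group_submissions (id INTEGER PRIMARY KEY, submission_id INTEGER, student_id INTEGER);
INSERT INTO submissions VALUES (1, 1, 2), (2, 2, 1), (3, 1, 3);
-- Row 3 is hypothetical: student 2 is also a group member on submission 3.
INSERT INTO group_submissions VALUES (1, 1, 1), (2, 2, 2), (3, 3, 2);
""")

# Second attempt from the question: add the two distinct counts.
summed = dict(conn.execute("""
SELECT s.assignment_id,
       COUNT(DISTINCT CASE WHEN gs.student_id IS NOT NULL THEN gs.student_id END)
     + COUNT(DISTINCT CASE WHEN s.student_id IS NOT NULL THEN s.student_id END)
FROM submissions s
LEFT JOIN group_submissions gs ON gs.submission_id = s.id
GROUP BY s.assignment_id
""").fetchall())

# Reference count: collect every student per assignment into one set.
students = {}
for assignment_id, submitter, member in conn.execute("""
    SELECT s.assignment_id, s.student_id, gs.student_id
    FROM submissions s
    LEFT JOIN group_submissions gs ON gs.submission_id = s.id
"""):
    students.setdefault(assignment_id, set()).add(submitter)
    if member is not None:
        students[assignment_id].add(member)
actual = {k: len(v) for k, v in students.items()}

print(summed)  # {1: 4, 2: 2}
print(actual)  # {1: 3, 2: 2}
```

Assignment 1 comes out as 4 instead of 3 because student 2 is counted once in each column.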

NOTE: This is a MySQL database

If each student appears either in the submissions table or in the group_submissions table, but never in both, you can simply join the tables and add the two counts:

-- DISTINCT keeps a submitter from being counted once per joined group member
SELECT a.id, COUNT(DISTINCT s.student_id) + COUNT(DISTINCT gs.student_id)
FROM assignments a
JOIN submissions s ON a.id = s.assignment_id
LEFT JOIN group_submissions gs ON s.id = gs.submission_id
GROUP BY a.id;

If there can be duplicates, i.e. a student can be in both the submissions and group_submissions tables, you can UNION the two and select from that:

SELECT assignment_id,COUNT(DISTINCT student_id)
FROM (
    SELECT assignment_id,student_id
    FROM submissions
    UNION
    SELECT assignment_id,gs.student_id
    FROM group_submissions gs
        JOIN submissions s on gs.submission_id = s.id) T1
GROUP BY assignment_id;

Since you can't change the data, you'll need to use a UNION subquery, and then aggregate over that.

SELECT a.id, a.name, COUNT(DISTINCT x.student_id) AS students
FROM Assignments AS a
LEFT JOIN (
   SELECT assignment_id, student_id FROM Submissions
   UNION 
   SELECT s.assignment_id, g.student_id
   FROM Submissions AS s
   INNER JOIN Group_submissions AS g ON s.id = g.submission_id
) AS x ON a.id = x.assignment_id
GROUP BY a.id, a.name
;
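As a quick check, here is a sketch that runs this UNION query through Python's sqlite3 module (an assumption; the question targets MySQL, but the query uses only standard SQL). The third assignment, 'unsubmitted-assignment', is hypothetical and has no submissions; it is there to show that the LEFT JOIN keeps such assignments with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Assignments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Submissions (id INTEGER PRIMARY KEY, assignment_id INTEGER, student_id INTEGER);
CREATE TABLE Group_submissions (id INTEGER PRIMARY KEY, submission_id INTEGER, student_id INTEGER);
-- Assignment 3 is hypothetical and has no submissions at all.
INSERT INTO Assignments VALUES (1, 'first-assignment'), (2, 'second-assignment'),
                               (3, 'unsubmitted-assignment');
INSERT INTO Submissions VALUES (1, 1, 2), (2, 2, 1), (3, 1, 3);
INSERT INTO Group_submissions VALUES (1, 1, 1), (2, 2, 2);
""")

# UNION merges submitters and group members into one (assignment, student)
# list and removes duplicates before the outer query counts them.
union_counts = conn.execute("""
SELECT a.id, a.name, COUNT(DISTINCT x.student_id) AS students
FROM Assignments AS a
LEFT JOIN (
   SELECT assignment_id, student_id FROM Submissions
   UNION
   SELECT s.assignment_id, g.student_id
   FROM Submissions AS s
   INNER JOIN Group_submissions AS g ON s.id = g.submission_id
) AS x ON a.id = x.assignment_id
GROUP BY a.id, a.name
ORDER BY a.id
""").fetchall()

for row in union_counts:
    print(row)
# (1, 'first-assignment', 3)
# (2, 'second-assignment', 2)
# (3, 'unsubmitted-assignment', 0)
```

The ORDER BY is added here only to make the printed order deterministic.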

Edit: vhu's first query is better as long as the following cannot happen: assignment X is submitted by student Y with a group_submission credit for student Z, and there is another submission for assignment X either made directly by student Z or carrying a group_submission credit for student Y (because then those students would be counted twice).

You already tagged the question as mysql; the version number is usually also useful for a good answer.

The following gives the expected counts for the sample data:

SELECT  
  a.id,a.name
  , LENGTH(CONCAT(GROUP_CONCAT(s.`student_id`) ,IF(GROUP_CONCAT(gs.student_id) is NULL,'',','),IF(GROUP_CONCAT(gs.student_id) is NULL,'',GROUP_CONCAT(gs.student_id))))
   - LENGTH(REPLACE(CONCAT(GROUP_CONCAT(s.`student_id`) ,IF(GROUP_CONCAT(gs.student_id) is NULL,'',','),IF(GROUP_CONCAT(gs.student_id) is NULL,'',GROUP_CONCAT(gs.student_id))), ',', '')) + 1 as count_studints
FROM 
  Submissions s 
  LEFT JOIN Group_submissions gs ON gs.submission_id = s.id 
  INNER JOIN Assignments a on s.assignment_id = a.id
WHERE s.`student_id` NOT IN (SELECT student_id 
                           FROM Group_submissions gs 
                           WHERE gs.submission_id = s.id)
GROUP BY a.id,a.name;

Schema and sample data used in the fiddle:

CREATE TABLE Group_submissions (
  `id` INTEGER,
  `submission_id` INTEGER,
  `student_id` INTEGER
);

INSERT INTO Group_submissions (`id`, `submission_id`, `student_id`) VALUES
  ('1', '1', '1'),
  ('2', '2', '2');

CREATE TABLE Submissions (
  `id` INTEGER,
  `assignment_id` INTEGER,
  `student_id` INTEGER
);

INSERT INTO Submissions (`id`, `assignment_id`, `student_id`) VALUES
  ('1', '1', '2'),
  ('2', '2', '1'),
  ('3', '1', '3'),
  ('4', '3', '1');

CREATE TABLE Assignments (
  `id` INTEGER,
  `name` VARCHAR(17)
);

INSERT INTO Assignments (`id`, `name`) VALUES
  ('1', 'first-assignment'),
  ('2', 'second-assignment'),
  ('3', 'third-assignment');

Running the query above against this data returns:

+----+-------------------+----------------+
| id |       name        | count_studints |
+----+-------------------+----------------+
|  1 | first-assignment  |              3 |
|  2 | second-assignment |              2 |
|  3 | third-assignment  |              1 |
+----+-------------------+----------------+

db<>fiddle here
