MySQL: SUM a column based on another column's entry

I have two tables, one is users and one is user_answers.

A user can take a survey. The survey has a leading question which determines the following question set. If the user wants to change their question set before completing, they can go back and re-answer.

The problem I'm having is that I can't get MySQL to tally the results accurately. So far I have this:

select q_id, 
sum(IF(answer like '%A%',1,0)) as A, 
sum(IF(answer like '%B%',1,0)) as B, 
sum(IF(answer like '%C%',1,0)) as C, 
sum(IF(answer like '%D%',1,0)) as D,
sum(IF(answer like '%E%',1,0)) as E,
sum(IF(answer like '%F%',1,0)) as F
from user_answers as t1
join 
(   select distinct id,`date`
    from users
    WHERE finished = 1
    AND date BETWEEN "2015-09-04" AND "2015-09-10"  
) inr
on inr.id=t1.user_id
group by q_id;

This gives me the counts of all users' answers in a nice column layout. But some people changed their leading question (q_id=0), and this query still counts their answers to the other question sets they completed. A person could have answered all of questions 0-11 (stored in the same table), but I only want to sum the chunk they chose based on question 0.

If I were to pseudo write this terribly in another language, I would do it like:

foreach(user_id){
  $result = mysqlfetch(select * from user_answers where uid=user_id and q_id=0);
  if(this.q_id(0).response = A){
    //questions 1-4 get added to tally
  }
  if(this.q_id(0).response = B){
    //questions 5-8 get added to tally
  }
  if(this.q_id(0).response = C){
     //questions 9-11 get added to tally
  }
}

But I don't know how to conditionally SUM in the MySQL query based on the user's q_id=0 response when, in my example, that response hasn't even been joined yet. Sorry for the messy table design; I didn't foresee this being a problem when writing the query.

You can just join in the information, by joining user_answers to itself to pick up each user's answer to question 0:

select t1.q_id, 
       sum(t1.answer like '%A%') as A, 
       sum(t1.answer like '%B%') as B, 
       sum(t1.answer like '%C%') as C, 
       sum(t1.answer like '%D%') as D, 
       sum(t1.answer like '%E%') as E, 
       sum(t1.answer like '%F%') as F
from user_answers t1 join 
     (select distinct id, `date`
      from users
      where finished = 1 and date BETWEEN '2015-09-04' AND '2015-09-10' 
     ) inr
     on inr.id = t1.user_id join
     user_answers q0
     on q0.user_id = t1.user_id and q0.q_id = 0
where (q0.answer = 'A' and t1.q_id in (1, 2, 3, 4)) or
      (q0.answer = 'B' and t1.q_id in (5, 6, 7, 8)) or
      (q0.answer = 'C' and t1.q_id in (9, 10, 11))
group by t1.q_id;

Some people might prefer to put the logic in the last ON clause rather than the WHERE clause. Because these are inner joins, the result is the same either way, so this is strictly a matter of preference. I like to see more complex logic in the WHERE clause.
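To make the ON-versus-WHERE point concrete, here is a minimal sketch that runs both forms against the same rows and compares the output. It uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are invented for illustration, and it assumes the question-0 reply is stored in the answer column, as in the question's own query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_answers (user_id INT, q_id INT, answer TEXT);
-- Hypothetical rows: each user also has a stale answer from an abandoned set.
INSERT INTO user_answers VALUES
  (1, 0, 'A'), (1, 1, 'B'), (1, 5, 'C'),
  (2, 0, 'B'), (2, 1, 'A'), (2, 5, 'D');
""")

# Which question ids belong to which leading answer.
RULE = """(q0.answer = 'A' AND t1.q_id IN (1, 2, 3, 4)) OR
          (q0.answer = 'B' AND t1.q_id IN (5, 6, 7, 8)) OR
          (q0.answer = 'C' AND t1.q_id IN (9, 10, 11))"""

# Variant 1: the rule lives in the WHERE clause.
where_sql = f"""
SELECT t1.user_id, t1.q_id
FROM user_answers t1
JOIN user_answers q0 ON q0.user_id = t1.user_id AND q0.q_id = 0
WHERE {RULE}
ORDER BY t1.user_id, t1.q_id"""

# Variant 2: the same rule moved into the ON clause of the inner join.
on_sql = f"""
SELECT t1.user_id, t1.q_id
FROM user_answers t1
JOIN user_answers q0
  ON q0.user_id = t1.user_id AND q0.q_id = 0 AND ({RULE})
ORDER BY t1.user_id, t1.q_id"""

where_rows = conn.execute(where_sql).fetchall()
on_rows = conn.execute(on_sql).fetchall()
print(where_rows, on_rows)
```

Both variants keep only user 1's question 1 and user 2's question 5, and the stale answers drop out. This equivalence holds for inner joins; with a LEFT JOIN, moving a condition between ON and WHERE can change the result.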

Note: you still might not get the results that you want. The inr subquery might return duplicates if a user is recorded on more than one date. If this is an issue, either remove date from the subquery or include it in the outer GROUP BY.
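The duplicate issue can be reproduced with a small sketch. It again uses Python's sqlite3 module in place of MySQL, and it assumes a users table in which the same id can appear on two dates; whether that can happen in the real schema is not stated in the question. The helper name tally_b is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INT, date TEXT, finished INT);
CREATE TABLE user_answers (user_id INT, q_id INT, answer TEXT);
-- Hypothetical data: user 1 appears on two dates inside the range.
INSERT INTO users VALUES (1, '2015-09-05', 1), (1, '2015-09-06', 1);
INSERT INTO user_answers VALUES (1, 0, 'A'), (1, 1, 'B');
""")

def tally_b(subquery_columns):
    """Count 'B' answers per question, varying the inr subquery's columns."""
    sql = f"""
    SELECT t1.q_id, SUM(t1.answer LIKE '%B%') AS B
    FROM user_answers t1
    JOIN (SELECT DISTINCT {subquery_columns}
          FROM users
          WHERE finished = 1
            AND date BETWEEN '2015-09-04' AND '2015-09-10') inr
      ON inr.id = t1.user_id
    JOIN user_answers q0
      ON q0.user_id = t1.user_id AND q0.q_id = 0
    WHERE q0.answer = 'A' AND t1.q_id IN (1, 2, 3, 4)
    GROUP BY t1.q_id"""
    return conn.execute(sql).fetchall()

with_date = tally_b("id, date")   # one inr row per (id, date) pair
without_date = tally_b("id")      # one inr row per user
print(with_date, without_date)
```

With date in the subquery, the single 'B' answer to question 1 is counted twice; selecting only id counts it once.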

  1. Create a table to hold your tally criteria:

     CREATE TABLE answers_to_tally (
       answer0 CHAR(1),
       q_id    INT
     );

     INSERT INTO answers_to_tally (answer0, q_id) VALUES
       ('A', 1), ('A', 2), ('A', 3), ('A', 4),
       ('B', 5), ('B', 6), ('B', 7), ('B', 8),
       ('C', 9), ('C', 10), ('C', 11);

  2. By joining that table with your user_answers table, you can obtain a set of (user_id, q_id) pairs indicating which questions should be tallied for which users:

     SELECT a.user_id, t.q_id
     FROM user_answers AS a
     JOIN answers_to_tally AS t
       ON a.answer = t.answer0 AND a.q_id = 0

  3. The whole shebang can then be put together like this:

     SELECT q_id,
            SUM(FIND_IN_SET('A', answer) > 0) AS A,
            SUM(FIND_IN_SET('B', answer) > 0) AS B,
            SUM(FIND_IN_SET('C', answer) > 0) AS C,
            SUM(FIND_IN_SET('D', answer) > 0) AS D,
            SUM(FIND_IN_SET('E', answer) > 0) AS E,
            SUM(FIND_IN_SET('F', answer) > 0) AS F
     FROM user_answers
     NATURAL JOIN (
       SELECT a.user_id, t.q_id
       FROM user_answers AS a
       JOIN users AS u ON a.user_id = u.id
       JOIN answers_to_tally AS t
         ON a.answer = t.answer0 AND a.q_id = 0
       WHERE u.finished = 1
         AND u.date BETWEEN '2015-09-04' AND '2015-09-10'
     ) x
     GROUP BY q_id
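
The three steps above can be tried end to end with a rough sketch like the following. It runs on Python's sqlite3 module rather than MySQL, so FIND_IN_SET (which SQLite lacks) is swapped for the LIKE test from the question; the sample users and answers are made up, and only columns A to C are tallied to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INT, date TEXT, finished INT);
CREATE TABLE user_answers (user_id INT, q_id INT, answer TEXT);
CREATE TABLE answers_to_tally (answer0 CHAR(1), q_id INT);

INSERT INTO answers_to_tally (answer0, q_id) VALUES
  ('A', 1), ('A', 2), ('A', 3), ('A', 4),
  ('B', 5), ('B', 6), ('B', 7), ('B', 8),
  ('C', 9), ('C', 10), ('C', 11);

-- Hypothetical data: user 3 never finished and should be ignored.
INSERT INTO users VALUES
  (1, '2015-09-05', 1), (2, '2015-09-06', 1), (3, '2015-09-07', 0);
INSERT INTO user_answers VALUES
  (1, 0, 'A'), (1, 1, 'B'), (1, 5, 'C'),
  (2, 0, 'B'), (2, 1, 'A'), (2, 5, 'B'),
  (3, 0, 'A'), (3, 1, 'B');
""")

rows = conn.execute("""
SELECT q_id,
       SUM(answer LIKE '%A%') AS A,   -- LIKE stands in for FIND_IN_SET here
       SUM(answer LIKE '%B%') AS B,
       SUM(answer LIKE '%C%') AS C
FROM user_answers
NATURAL JOIN (
    -- (user_id, q_id) pairs that should be tallied for finished users
    SELECT a.user_id AS user_id, t.q_id AS q_id
    FROM user_answers AS a
    JOIN users AS u ON a.user_id = u.id
    JOIN answers_to_tally AS t
      ON a.answer = t.answer0 AND a.q_id = 0
    WHERE u.finished = 1
      AND u.date BETWEEN '2015-09-04' AND '2015-09-10'
) x
GROUP BY q_id
ORDER BY q_id
""").fetchall()
print(rows)
```

Only user 1's answer to question 1 and user 2's answer to question 5 survive the NATURAL JOIN, which matches on the shared user_id and q_id columns. Keeping the mapping in a table means a change to the question sets is an INSERT or DELETE rather than an edit to the query.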
