
How to make correct calculations?

In a PostgreSQL database I have a table called answers. This table stores information about how users answered questions. There are only 4 questions in the table. The number of users who answered can vary, and a user may have answered only some of the questions.

Table answers:

| EMPLOYEE | QUESTION_ID | QUESTION_TEXT          | OPTION_ID | OPTION_TEXT  |
|----------|-------------|------------------------|-----------|--------------|
| Bob      | 1           | Do you like soup?      | 1         | Yes          |
| Alex     | 1           | Do you like soup?      | 1         | Yes          |
| Kate     | 1           | Do you like soup?      | 3         | I don't know |
| Bob      | 2           | Do you like ice cream? | 1         | Yes          |
| Alex     | 2           | Do you like ice cream? | 3         | I don't know |
| Oliver   | 2           | Do you like ice cream? | 1         | Yes          |
| Bob      | 3           | Do you like summer?    | 2         | No           |
| Alex     | 3           | Do you like summer?    | 1         | Yes          | 
| Jack     | 3           | Do you like summer?    | 2         | No           |
| Bob      | 4           | Do you like winter?    | 3         | I don't know |
| Alex     | 4           | Do you like winter?    | 1         | Yes          |
| Oliver   | 4           | Do you like winter?    | 3         | I don't know |
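
To try the queries below locally, the sample table can be recreated with a script along these lines. The column types and NOT NULL constraints are assumptions, since the question does not show the actual table definition; the rows are copied from the table above.

```sql
-- Assumed schema: the real column types are not given in the question.
CREATE TABLE answers (
    employee      text    NOT NULL,
    question_id   integer NOT NULL,
    question_text text    NOT NULL,
    option_id     integer NOT NULL,
    option_text   text    NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO answers (employee, question_id, question_text, option_id, option_text) VALUES
    ('Bob',    1, 'Do you like soup?',      1, 'Yes'),
    ('Alex',   1, 'Do you like soup?',      1, 'Yes'),
    ('Kate',   1, 'Do you like soup?',      3, 'I don''t know'),
    ('Bob',    2, 'Do you like ice cream?', 1, 'Yes'),
    ('Alex',   2, 'Do you like ice cream?', 3, 'I don''t know'),
    ('Oliver', 2, 'Do you like ice cream?', 1, 'Yes'),
    ('Bob',    3, 'Do you like summer?',    2, 'No'),
    ('Alex',   3, 'Do you like summer?',    1, 'Yes'),
    ('Jack',   3, 'Do you like summer?',    2, 'No'),
    ('Bob',    4, 'Do you like winter?',    3, 'I don''t know'),
    ('Alex',   4, 'Do you like winter?',    1, 'Yes'),
    ('Oliver', 4, 'Do you like winter?',    3, 'I don''t know');
```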

I need this result:

| EMPLOYEE | CALC |
|----------|------|
| Bob      | 2    |
| Alex     | 2    |
| Kate     | 1    |
| Jack     | 1    |
| Oliver   | 2    |

The calc column is calculated by the formula:

CALC = A + B;

A - If a user answered the first and/or second question, the value is 1, otherwise 0.
B - If a user answered the third and/or fourth question, the value is 1, otherwise 0.

For example, Bob answered all 4 questions, so the calc column has the value 2 for Bob. Kate, on the other hand, answered only the first question, so the calc column has the value 1 for her: in her case A is 1 and B is 0.

So far I have tried the following code, but it does not work as I expected:

select
    employee,
    (
        case when count(question_id = 1) or count(question_id = 2) > 0 then 1 else 0 end
        +
        case when count(question_id = 3) or count(question_id = 4) > 0 then 1 else 0 end
    ) as calc
from
    answers
group by
    employee
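
A likely reason the attempt above misbehaves: count(expr) counts every row where expr is not NULL, and a comparison such as question_id = 1 evaluates to false (not NULL) for non-matching rows, so those rows are counted as well. In addition, `count(...) or count(...) > 0` applies OR to a bigint, which PostgreSQL should reject with a type error. The sketch below uses made-up inline values rather than the answers table to show the counting behaviour:

```sql
-- Hypothetical inline data: three rows with question_id 1, 2 and 3.
SELECT count(question_id = 1)                  AS counts_non_null, -- 3, because false is not NULL
       count(*) FILTER (WHERE question_id = 1) AS counts_matches   -- 1, only the matching row
FROM (VALUES (1), (2), (3)) AS t(question_id);
```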

You can use a conditional aggregate function, SUM(DISTINCT ...), together with GROUP BY.

Query 1:

SELECT employee,
       (SUM(DISTINCT CASE WHEN QUESTION_ID IN (1,2) THEN 1 ELSE 0 END) + 
       SUM(DISTINCT CASE WHEN QUESTION_ID IN (3,4) THEN 1 ELSE 0 END)) CALC 
FROM answers
GROUP BY employee

Results:

| employee | calc |
|----------|------|
|     Alex |    2 |
|      Bob |    2 |
|     Jack |    1 |
|     Kate |    1 |
|   Oliver |    2 |
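
Why this works: within each employee's group the CASE expression only produces the values 0 and 1, so SUM(DISTINCT ...) is 1 when at least one row matches and 0 otherwise. As a sketch, the two parts of the formula can be shown separately (the column aliases a and b are introduced here for illustration and follow the A and B of the question):

```sql
-- Same aggregates as Query 1, but returned as separate columns.
SELECT employee,
       SUM(DISTINCT CASE WHEN question_id IN (1, 2) THEN 1 ELSE 0 END) AS a,
       SUM(DISTINCT CASE WHEN question_id IN (3, 4) THEN 1 ELSE 0 END) AS b
FROM answers
GROUP BY employee
ORDER BY employee;
```

With the sample data, Kate gets a = 1 and b = 0, while Jack gets a = 0 and b = 1.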

Similar to D-Shih's answer, this can also be achieved with a count and a FILTER clause:

select
    employee,
    (
        case
            when count(question_id) filter (where question_id in(1, 2)) > 0
            then 1
            else 0
        end +
        case
            when count(question_id) filter (where question_id in(3, 4)) > 0
            then 1
            else 0
        end
    ) as calc
from answers
group by employee
order by employee

In Postgres, I would phrase this as conditional aggregation, but not with COUNT(DISTINCT):

select employee,
       (max( (question_id in (1, 2))::int ) +
        max( (question_id in (3, 4))::int )
       ) as calc
from answers
group by employee;

Besides being more concise, this version avoids DISTINCT: count(distinct) generally incurs more overhead than the more "basic" aggregation functions such as min(), max(), count() and sum().
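
A closely related variant, not taken from the answers above, uses PostgreSQL's boolean aggregate bool_or, which is true when the condition holds for at least one row in the group. This is only a sketch under the same assumptions about the answers table:

```sql
-- bool_or(...) yields a boolean per group; ::int turns true/false into 1/0.
SELECT employee,
       bool_or(question_id IN (1, 2))::int
     + bool_or(question_id IN (3, 4))::int AS calc
FROM answers
GROUP BY employee
ORDER BY employee;
```

It expresses "answered at least one of these questions" directly and should return the same calc values as the queries above.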
