Prioritize Distinct SQL Query

Question

I am trying to write a SQL query in VBA that returns distinct records, but I want to control which record is returned for each ID. In this test example, I want one record per ID, prioritizing the year in the order Senior, Junior, Sophomore, Freshman, and then breaking ties by term in the order C, A, B.

ID  Name    Term   Year
1   Suzy    A      Senior
1   Suzy    B      Junior
2   John    C      Senior
2   John    B      Sophomore
3   Pete    A      Junior
4   Carl    C      Freshman
5   Sally   B      Senior

When using the "Distinct" operator in a SQL query, I don't know what criteria it uses to decide which record to return or if it is possible to place a priority like this. Any help would be much appreciated. I think I saw a possible solution using "Select Case", but I didn't quite understand it. If that is the correct solution, I'll be happy to use it, but I would like an explanation as to how it works if possible.

Thanks!

Answer 1

Assuming that Y is an auxiliary table to order the column "Year":

Year      Order
--------- -----
Senior    1
Junior    2
Sophomore 3
Freshman  4

Assuming X is the table that contains your data and that all rows are distinct themselves (that is, there is no record in X where ID, Name, Term and Year are the same), and also supposing that same ID implies same Name and vice-versa:

SELECT  x1.*
FROM    X x1
        INNER JOIN Y y1 ON (x1.year = y1.year)
WHERE   NOT EXISTS (
            SELECT  *
            FROM    X x2
                    INNER JOIN Y y2 ON (x2.year = y2.year)
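            WHERE   x2.id = x1.id -- could also use "name" instead of id (as one implies the other)
                AND (
                        y2.order < y1.order
                    -- note: comparing term directly ranks terms alphabetically (A, B, C);
                    -- for the requested C, A, B order, map Term to a rank the same way Year is mapped
                    OR  (y2.order = y1.order AND x2.term < x1.term)
                    )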
        )
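To check this query against the sample data, here is a small sketch using Python's built-in sqlite3 module (my choice for a runnable demo, not the asker's Access/VBA setup). It renames the "Order" column to prio, since ORDER is a reserved word, and adds a second lookup table T for Term so that the requested C, A, B order is honored. Both T and prio are my assumptions, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (id INTEGER, name TEXT, term TEXT, year TEXT);
    INSERT INTO X VALUES
        (1, 'Suzy',  'A', 'Senior'),
        (1, 'Suzy',  'B', 'Junior'),
        (2, 'John',  'C', 'Senior'),
        (2, 'John',  'B', 'Sophomore'),
        (3, 'Pete',  'A', 'Junior'),
        (4, 'Carl',  'C', 'Freshman'),
        (5, 'Sally', 'B', 'Senior');

    -- Lookup table for Year (1 = highest priority).
    CREATE TABLE Y (year TEXT, prio INTEGER);
    INSERT INTO Y VALUES
        ('Senior', 1), ('Junior', 2), ('Sophomore', 3), ('Freshman', 4);

    -- Hypothetical lookup table for Term, giving the C, A, B order.
    CREATE TABLE T (term TEXT, prio INTEGER);
    INSERT INTO T VALUES ('C', 1), ('A', 2), ('B', 3);
""")

QUERY = """
    SELECT x1.id, x1.name, x1.term, x1.year
    FROM   X x1
           JOIN Y y1 ON x1.year = y1.year
           JOIN T t1 ON x1.term = t1.term
    WHERE  NOT EXISTS (
               SELECT 1
               FROM   X x2
                      JOIN Y y2 ON x2.year = y2.year
                      JOIN T t2 ON x2.term = t2.term
               WHERE  x2.id = x1.id
                  AND (    y2.prio < y1.prio
                       OR (y2.prio = y1.prio AND t2.prio < t1.prio))
           )
    ORDER BY x1.id
"""

# Keep only rows for which no same-ID row ranks ahead of them.
rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With the sample rows this keeps one row per ID: Suzy/A/Senior, John/C/Senior, Pete/A/Junior, Carl/C/Freshman and Sally/B/Senior. The sample data never has two rows with the same ID and the same Year, so the Term tie-break is not actually exercised here.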

Edit:

I'll try to clarify what's going on by simplifying (and generalizing) the above scheme somewhat.

What you're trying to accomplish is to select a set of output rows, each output row unique for a given key. The other columns should all come from the same underlying row (input row), according to some ordering criteria.

SELECT  *
FROM    YourTable t1
WHERE   NOT EXISTS (
            SELECT  *
            FROM    YourTable t2
            WHERE   t1.key = t2.key -- for a composite key, connect the column comparisons with AND
                AND (
                    -- any ordering condition that is true when
                    -- this row t2 ranks ahead of (has higher priority than)
                    -- the candidate row t1
                )
        )

(This is a very basic concept; read it as "get the rows for which there is no other row of the same kind that is lesser".)
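As a minimal sketch of the general pattern, assuming a made-up table items(grp, val) where grp plays the role of the key and a smaller val means higher priority (run through Python's sqlite3 only so it can be executed as-is):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: group 'b' contains two rows that tie on val.
    CREATE TABLE items (grp TEXT, val INTEGER);
    INSERT INTO items VALUES ('a', 3), ('a', 1), ('b', 2), ('b', 2), ('c', 5);
""")

# Keep a row only if no row in the same group has a strictly smaller val.
rows = conn.execute("""
    SELECT t1.grp, t1.val
    FROM   items t1
    WHERE  NOT EXISTS (
               SELECT 1
               FROM   items t2
               WHERE  t2.grp = t1.grp
                  AND t2.val < t1.val
           )
    ORDER BY t1.grp, t1.val
""").fetchall()

print(rows)
```

Note that group 'b' comes back twice: the comparison is strict, so two rows that tie on every ordering column are both kept. That is why the answer assumes the rows are distinct to begin with.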

Your problem got a bit more complicated by the fact that your "Year" column does not have a natural ordering, so we have to invent one.

That's where this metatable comes from: to map Year to some ordering.

Now you have to apply the general idea explained just above to your augmented table (that is, your original one with my suggested metatable).

The ordering condition also follows a common pattern. Because your order has more than one level ("order by this, break ties by that"), you use the following general construction:

    (t2.first_criteria < t1.first_criteria)
OR  (t2.first_criteria = t1.first_criteria AND t2.second_criteria < t1.second_criteria)
OR  (t2.first_criteria = t1.first_criteria AND t2.second_criteria = t1.second_criteria AND t2.third_criteria < t1.third_criteria)
...

(... any other of its kind, that is strictly lesser by the first criteria, or equal by the first but strictly lesser by the second, or equal by the first and second and lesser by the third...)
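Here is a small sketch of a three-level version of this condition, with a strict comparison at the last level. The table R and its columns k, c1, c2, c3 are invented for illustration, and Python's sqlite3 is used only to make it runnable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: k is the key; c1, c2, c3 are the ordering levels.
    CREATE TABLE R (k INTEGER, c1 INTEGER, c2 INTEGER, c3 INTEGER);
    INSERT INTO R VALUES
        (1, 1, 2, 9),
        (1, 1, 1, 5),
        (1, 1, 1, 3),
        (1, 2, 0, 0),
        (2, 4, 4, 4);
""")

# A row survives only if no row with the same key precedes it
# in lexicographic (c1, c2, c3) order.
rows = conn.execute("""
    SELECT t1.k, t1.c1, t1.c2, t1.c3
    FROM   R t1
    WHERE  NOT EXISTS (
               SELECT 1
               FROM   R t2
               WHERE  t2.k = t1.k
                  AND (    t2.c1 < t1.c1
                       OR (t2.c1 = t1.c1 AND t2.c2 < t1.c2)
                       OR (t2.c1 = t1.c1 AND t2.c2 = t1.c2 AND t2.c3 < t1.c3))
           )
    ORDER BY t1.k
""").fetchall()

print(rows)
```

For key 1, the row (1, 1, 1, 3) wins: it ties with (1, 1, 1, 5) on the first two levels and is smaller on the third. The last level has to be a strict comparison too; if it tested equality instead, every row would match itself in the subquery and nothing would be returned.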
