ROW_NUMBER() with DISTINCT

Question

I've got a table of ticket assignments showing the different groups a ticket is transferred to before its resolved. Here is a simplified table:

asgn_grp | date   | ticket_id
---------|--------|----------
       A | 1-1-15 | 1
       A | 1-2-15 | 1
       B | 1-3-15 | 1
       A | 1-1-15 | 2
       C | 1-2-15 | 2
       B | 1-3-15 | 2
       C | 1-1-15 | 3
       B | 1-2-15 | 3

I need to get a count of the second distinct group that a ticket was assigned to, meaning I want to know once a ticket is transferred out of the group its in, internal transfers don't count. So the second distinct group for ticket 1 is B , ticket 2 is C , ticket 3 is B . I need to get a count of these, so the end result I need is

asgn_grp | count
---------|-------
       B | 2
       C | 1

I've tried

SELECT distinct top 2 asgn_grp, ROW_NUMBER() OVER (ORDER BY date)

As my sub-query and pulling the second one out of that, but when I add the ROW_NUMBER() it messes up my distinct. If I pull the ROW_NUMBER() out of the sub-query, I have now way to order my values to ensure I get the second one after I DISTINCT the list.

Also, let me know if I was unclear about anything.

Answer 1

First, you need to identify groups of constant values of asgn_grp for each ticket. You can do that with a difference of row numbers.

Then, you need the ordering for each group. For that, use the minimum date in the group. Finally, you can rank these groups to get the second one, using dense_rank() on the date.

select asgn_grp, count(*)
from (select ticket_id, asgn_grp,
             dense_rank() over (partition by ticket_id order by grpdate) as seqnum
      from (select s.*, min(date) over (partition by ticket_id, asgn_grp, grp) as grpdate
            from (select s.*,
                         (row_number() over (partition by ticket_id order by date) -
                          row_number() over (partition by ticket_id, asgn_grp order by date)
                         ) as grp
                  from simplified s
                 ) s
           ) s
    ) s
where seqnum = 2
group by asgn_grp;

Answer 2

Instead of using distinct, try using group by twice.

select asgn_grp, count(*) from (
    select * , row_number() over (partition by ticket_id order by min_date) rn
    from (
        select asgn_grp, ticket_id, min(date) min_date
        from Table1 group by asgn_grp, ticket_id
    ) t1 
) t2 where rn = 2
group by asgn_grp;

http://sqlfiddle.com/#!3/a0d1e

The derived table t1 contains every unique asgn_grp for each ticket_id along with the minimum date of each asgn_grp . For the sample data t1 has the following rows:

ASGN_GRP    TICKET_ID   MIN_DATE
A   1   January, 01 2015 00:00:00+0000
B   1   January, 03 2015 00:00:00+0000
A   2   January, 01 2015 00:00:00+0000
B   2   January, 03 2015 00:00:00+0000
C   2   January, 02 2015 00:00:00+0000
B   3   January, 02 2015 00:00:00+0000
C   3   January, 01 2015 00:00:00+0000

The outer query then uses row_number() to number each asgn_grp within a ticket_id by its min_date and generates the following for t2

ASGN_GRP    TICKET_ID   MIN_DATE    RN
A   1   January, 01 2015 00:00:00+0000  1
B   1   January, 03 2015 00:00:00+0000  2
A   2   January, 01 2015 00:00:00+0000  1
C   2   January, 02 2015 00:00:00+0000  2
B   2   January, 03 2015 00:00:00+0000  3
C   3   January, 01 2015 00:00:00+0000  1
B   3   January, 02 2015 00:00:00+0000  2

This table is filtered for RN = 2 and is grouped by asgn_grp to get the count for each asgn_grp .

Answer 3

If you need all assign groups with count zero for non-changed ones, use outer joins instead of inner joins

 WITH TBL AS
 (
 SELECT A.*, ROW_NUMBER() OVER(PARTITION BY ticket_id ORDER BY asgn_grp) AS RN
 FROM TABLE AS A
 )

 SELECT A.ASSN_GRP, COUNT(*) AS CNT
 FROM TBL AS A
 INNER JOIN TBL AS B
 ON B.TICKET_ID = A.TICKET_ID
 AND A.RN = B.RN + 1 
 GROUP BY A.ASSGN_GRP

Answer 4

As you want to know why using DISTINCT with ROW_NUMBER() changes your results:

You can see this question that is about differentioal between DISTINCT and GROUP BY .

And from that:

The GROUP BY query aggregates before it computes. The DISTINCT query computes before the aggregate.

So When you use ROW_NUMBER() -that is a scalar value- if query computes first you will have a unique field for ROW_NUMBER() results and then your DISTINCT will apply over it that in your result it will not find any duplicate row!

And for your results you can use this query

SELECT ticket_id, asgn_grp, 
    (SELECT COUNT([date]) FROM yourTable t WHERE t.asgn_grp = r.asgn_grp And t.ticket_id = r.ticket_id)
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ticket_id ORDER BY [date]) As ra
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY ticket_id, asgn_grp ORDER BY [date] Desc) As rn 
        FROM yourTable) findingOldDates
    WHERE rn = 1) r
WHERE ra = 2

ROW_NUMBER() with DISTINCT

Question

4 answers

solution1
0 2015-01-27 23:02:41

solution2
0 2015-01-27 23:22:14

solution3
0 2015-01-28 00:44:50

solution4
0 2015-04-13 05:09:57

ROW_NUMBER() with DISTINCT

Question

4 answers

solution1 0 2015-01-27 23:02:41

solution2 0 2015-01-27 23:22:14

solution3 0 2015-01-28 00:44:50

solution4 0 2015-04-13 05:09:57

solution1
0 2015-01-27 23:02:41

solution2
0 2015-01-27 23:22:14

solution3
0 2015-01-28 00:44:50

solution4
0 2015-04-13 05:09:57