
Selecting three rows after the one that meets the condition

We've got 7 guys travelling in Poland ( ;) ). The problem is to find up to three cities each of them visited next after visiting Warsaw. If a guy visits Warsaw twice, the second visit also counts as a starting point for a new journey. For example, guy 1 made not only the journey Warsaw, Cracow, Warsaw, Gdansk, but also the journey Warsaw, Gdansk.

Table A

+------+-----------+-----+
| date |   city    | guy |
+------+-----------+-----+
|    2 | Warsaw    |   1 |
|    4 | Cracow    |   1 |
|    5 | Cracow    |   2 |
|    6 | Bialystok |   3 |
|    7 | Warsaw    |   1 |
|    8 | Gdansk    |   1 |
|   10 | Warsaw    |   5 |
|   12 | Cracow    |   5 |
|   14 | Bialystok |   6 |
|   15 | Warsaw    |   7 |
|   20 | Warsaw    |   7 |
+------+-----------+-----+

So the final table for this would look like this:

+-----------+-----------+-----------+-----------+
| Starting  | 2nd dest. | 3rd dest. | 4th dest. |
+-----------+-----------+-----------+-----------+
| Warsaw    | Cracow    | Warsaw    | Gdansk    |
| Warsaw    | Gdansk    |           |           |
| Warsaw    | Cracow    |           |           |
| Warsaw    | Warsaw    |           |           |
| Warsaw    |           |           |           |
+-----------+-----------+-----------+-----------+

The problem is to write a query that automatically builds the final table from Table A.

Finding every starting point is no problem, but I have no idea how to find every second destination. It seems to me that there has to be a loop of some kind: the guy has to be the same as at the starting point, and the date of the second destination has to be greater than the date of THIS EXACT starting point.
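To pin down the rule described above before turning it into SQL, here is a minimal procedural sketch in Python. It is only an illustration of the intended semantics, not a proposed solution: the helper name `journeys_from` and the `TABLE_A` list are made up for this example, and the rows are copied from Table A.

```python
from collections import defaultdict

# Sample rows from Table A as (date, city, guy).
TABLE_A = [
    (2, "Warsaw", 1), (4, "Cracow", 1), (5, "Cracow", 2),
    (6, "Bialystok", 3), (7, "Warsaw", 1), (8, "Gdansk", 1),
    (10, "Warsaw", 5), (12, "Cracow", 5), (14, "Bialystok", 6),
    (15, "Warsaw", 7), (20, "Warsaw", 7),
]


def journeys_from(rows, start="Warsaw", max_next=3):
    """Hypothetical helper: build one journey per visit to `start`."""
    # Group each guy's visits in date order.
    by_guy = defaultdict(list)
    for date, city, guy in sorted(rows):
        by_guy[guy].append((date, city))

    journeys = []
    for visits in by_guy.values():
        for i, (date, city) in enumerate(visits):
            if city == start:
                # Up to `max_next` later visits by the same guy.
                following = [c for _, c in visits[i + 1 : i + 1 + max_next]]
                journeys.append((date, [city] + following))

    # Order the journeys by the date of their starting visit.
    journeys.sort(key=lambda j: j[0])
    return [cities for _, cities in journeys]


for journey in journeys_from(TABLE_A):
    print(journey)
```

Every Warsaw row opens its own journey, so guy 1 contributes two of them, which is exactly the "THIS EXACT starting point" requirement.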

Any help with solving this would be appreciated. ;)

SQLFiddle with some more sample data: http://sqlfiddle.com/#!2/de0f1

The data above is just a sample; the solution needs to handle a much larger set.

Solution using ROW_NUMBER() and PIVOT:

select guy, [1] as First, IsNull([2], '') as Second, IsNull([3], '') as Third, IsNull([4], '') as Fourth
from (
    -- number each guy's visits in date order
    select row_number() over (partition by guy order by guy, date) as number, city, guy
    from voyage
) x
pivot
(
    -- turn visit numbers 1..4 into columns
    max(city)
    for number in ([1], [2], [3], [4])
) as p

Demo here

My example assumes that the dates are of type int, but it could easily be reworked to use proper dates.
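For readers less familiar with PIVOT, the numbering-then-pivoting step can be sketched roughly like this in Python. This is only an approximation of what the query does, under the assumption that the `voyage` table holds the sample rows from the question; `first_four_per_guy` and `VOYAGE` are names invented for the illustration.

```python
from itertools import groupby

# Sample rows as (date, city, guy), assumed to match the voyage table.
VOYAGE = [
    (2, "Warsaw", 1), (4, "Cracow", 1), (5, "Cracow", 2),
    (6, "Bialystok", 3), (7, "Warsaw", 1), (8, "Gdansk", 1),
    (10, "Warsaw", 5), (12, "Cracow", 5), (14, "Bialystok", 6),
    (15, "Warsaw", 7), (20, "Warsaw", 7),
]


def first_four_per_guy(rows, width=4):
    """Hypothetical helper mimicking ROW_NUMBER() followed by PIVOT."""
    # ROW_NUMBER() OVER (PARTITION BY guy ORDER BY date): sort by guy, then date.
    ordered = sorted(rows, key=lambda r: (r[2], r[0]))
    result = {}
    for guy, group in groupby(ordered, key=lambda r: r[2]):
        # PIVOT on numbers 1..width: keep the first `width` cities as columns.
        cities = [city for _, city, _ in group][:width]
        # ISNULL(..., ''): pad missing columns with empty strings.
        result[guy] = cities + [""] * (width - len(cities))
    return result


for guy, cities in first_four_per_guy(VOYAGE).items():
    print(guy, cities)
```

Note that this yields one row per guy, beginning at his first recorded city, whether or not that city is Warsaw.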

EDIT:

This gives the desired output for the sample data (I'm not sure it is a generic approach, since the row positions where Warsaw may appear are hard-coded):

select *
from
(
    -- journeys that start at a guy's first recorded city
    select guy, [1] as First, IsNull([2], '') as Second, IsNull([3], '') as Third, IsNull([4], '') as Fourth
    from (
        select row_number() over (partition by guy order by guy, date) as number, city, guy
        from voyage
    ) x
    pivot
    (
        max(city)
        for number in ([1], [2], [3], [4])
    ) as p
    where [1] = 'Warsaw'

    union all

    -- journeys that start at a guy's third recorded city
    select guy, [3] as First, case when [4] = 'Warsaw' then '' else IsNull([4], '') end as Second, '' as Third, '' as Fourth
    from (
        select row_number() over (partition by guy order by guy, date) as number, city, guy
        from voyage
    ) x
    pivot
    (
        max(city)
        for number in ([1], [2], [3], [4])
    ) as p
    where [3] = 'Warsaw'

    union all

    -- journeys that start at a guy's second recorded city
    select guy, [2] as First, '' as Second, '' as Third, '' as Fourth
    from (
        select row_number() over (partition by guy order by guy, date) as number, city, guy
        from voyage
    ) x
    pivot
    (
        max(city)
        for number in ([1], [2], [3], [4])
    ) as p
    where [2] = 'Warsaw'
) bigdata
order by guy

Second Demo Here

If you are using SQL Server 2012 or a later version, you can solve the problem fairly easily with the help of the LEAD() analytic function:

WITH ThreeDestinations AS (
  SELECT
    *,
    Destination2 = LEAD(city, 1) OVER (PARTITION BY guy ORDER BY date),
    Destination3 = LEAD(city, 2) OVER (PARTITION BY guy ORDER BY date),
    Destination4 = LEAD(city, 3) OVER (PARTITION BY guy ORDER BY date)
  FROM
    dbo.voyage
)
SELECT
  StartingPoint = city,
  Destination2,
  Destination3,
  Destination4
FROM
  ThreeDestinations
WHERE
  city = 'Warsaw'
ORDER BY
  date
;

The three calls to LEAD give you the first three (or fewer) destinations after every city in the original set. The next, and final, step is just to filter out the rows where the starting point is not Warsaw.
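If you want to try the LEAD() idea without a SQL Server instance, the same logic can be run on the sample data with Python's built-in sqlite3 module. Treat this as a sketch under a few assumptions: the table layout (`date`, `city`, `guy` as integers and text) is inferred from the question, the bundled SQLite must be 3.25 or newer for window functions, and the T-SQL `alias = expression` form is rewritten with `AS` because SQLite does not accept it.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voyage (date INTEGER, city TEXT, guy INTEGER)")
conn.executemany(
    "INSERT INTO voyage (date, city, guy) VALUES (?, ?, ?)",
    [
        (2, "Warsaw", 1), (4, "Cracow", 1), (5, "Cracow", 2),
        (6, "Bialystok", 3), (7, "Warsaw", 1), (8, "Gdansk", 1),
        (10, "Warsaw", 5), (12, "Cracow", 5), (14, "Bialystok", 6),
        (15, "Warsaw", 7), (20, "Warsaw", 7),
    ],
)

# Same structure as the T-SQL query: look ahead 1, 2 and 3 rows per guy,
# then keep only the rows whose own city is Warsaw.
QUERY = """
WITH ThreeDestinations AS (
  SELECT
    date,
    city,
    LEAD(city, 1) OVER (PARTITION BY guy ORDER BY date) AS Destination2,
    LEAD(city, 2) OVER (PARTITION BY guy ORDER BY date) AS Destination3,
    LEAD(city, 3) OVER (PARTITION BY guy ORDER BY date) AS Destination4
  FROM voyage
)
SELECT city, Destination2, Destination3, Destination4
FROM ThreeDestinations
WHERE city = 'Warsaw'
ORDER BY date
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Missing destinations come back as NULL (None in Python); wrap the columns in COALESCE(..., '') if you prefer empty strings, as in the expected table.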

This should work. I have used CTEs to pick out each person's first journey from Warsaw and their second journey from Warsaw, and then PIVOT to combine the result sets and display them in the requested format. I hope this is useful.

;WITH cte_Itinerary([Date], City, Guy)
    AS (SELECT 2, 'Warsaw', 1 -- numeric literals, so ORDER BY [Date] sorts numerically
        UNION ALL
        SELECT 4, 'Cracow', 1
        UNION ALL
        SELECT 5, 'Cracow', 2
        UNION ALL
        SELECT 6, 'Bialystok', 3
        UNION ALL
        SELECT 7, 'Warsaw', 1
        UNION ALL
        SELECT 8, 'Gdansk', 1
        UNION ALL
        SELECT 10, 'Warsaw', 5
        UNION ALL
        SELECT 12, 'Cracow', 5
        UNION ALL
        SELECT 14, 'Bialystok', 6
        UNION ALL
        SELECT 15, 'Warsaw', 7
        UNION ALL
        SELECT 20, 'Warsaw', 7
    )
, cte_AddRowNumber
    AS (
    SELECT
           ROW_NUMBER() OVER (PARTITION BY Guy ORDER BY Guy, [Date]) AS number,
           City,
           Guy
    FROM cte_Itinerary
    )
, cte_FilterByWarsaw
    AS (
    SELECT
           ROW_NUMBER() OVER (PARTITION BY Guy ORDER BY Guy, [Date]) AS number,
           City,
           Guy
    FROM cte_Itinerary
    WHERE City = 'Warsaw'
    )
, cte_Result_AllDestinations
    AS (
    SELECT DISTINCT
           'One' AS ResultType,
           o.Number,
           o.City,
           o.Guy
    FROM cte_AddRowNumber AS o
         JOIN cte_FilterByWarsaw AS t
             ON o.Guy = t.Guy
            AND t.number <= o.number
    )
, cte_Result_SecondDestinations
    AS (
    SELECT 'Two' AS ResultType,
           ROW_NUMBER() OVER (PARTITION BY r1.Guy ORDER BY r1.number) AS number, -- order by visit number so the numbering is deterministic
           r1.City,
           r1.Guy
    FROM cte_Result_AllDestinations AS r1
         JOIN cte_Result_AllDestinations AS r2
             ON r1.Guy = r2.Guy
            AND r1.number >= r2.number
    WHERE r2.City = 'Warsaw'
      AND r2.number <> 1
    )
    SELECT Guy,
           [1] AS Starting,
           ISNULL([2], '') AS [2nd Dest.],
           ISNULL([3], '') AS [3rd Dest.],
           ISNULL([4], '') AS [4th Dest.]
    FROM(SELECT ResultType,Number,City,Guy
         FROM cte_Result_AllDestinations
         UNION ALL
         SELECT ResultType,Number,City,Guy
         FROM cte_Result_SecondDestinations)AS X 
         PIVOT(MAX(City)FOR NUMBER IN([1], [2], [3], [4]))AS P;
