简体   繁体   中英

Find lagging rows of a query omitted by a WHERE clause in SQL/SQLite

Setup

I have a table of race times, listing a runner, their team, and their race time:

CREATE TABLE race (person TEXT, team TEXT, timer FLOAT);

INSERT INTO race
  (person, team, timer)
VALUES
  ("ahmed", "red", 4.3),
  ("baadur", "green", 4.4),
  ("carel", "red", 4.5),
  ("dada", "green", 4.9),
  ("eder", "green", 5.0),
  ("farai", "red", 5.1);

I can make a list of all people on the red team and their ranking:

SELECT person, ROW_NUMBER() OVER(ORDER BY timer) AS ranking FROM race WHERE team="red";

which emits

person ranking
ahmed 1
carel 2
farai 3

Question

I want to also get the name of runner who followed each of these red runners, ie, who had the next slowest time—so I want:

person ranking next runner
ahmed 1 baadur
carel 2 dada
farai 3 null

where note how since nobody has a slower time than Farai, Farai's third column is null.

Can I do this efficiently with a single query?

Considerations

I'd like to avoid first getting the list of red runners and their times with one query and then making another three (or more generally N ) queries to get the runner with the next time, eg, this is what I do not want to do:

SELECT person FROM race WHERE timer>=4.3 AND person != "ahmed" LIMIT 1;
SELECT person FROM race WHERE timer>=4.5 AND person != "carel" LIMIT 1;
SELECT person FROM race WHERE timer>=5.1 AND person != "farai" LIMIT 1;
-- 😭

I might be able to rework this above approach to be a single query instead of multiple individual ones but I feel like it should be possible to get what I want in a single query if there was a way to run a subquery for each row that matches the WHERE team="red" clause to find the next row (which could be made fast by an index on timer ), but I'm not sure if that's possible.

For example, could I use SQLite's lag window function to achieve this? By itself, lag would look at rows that met my WHERE team="red" criterion so it wouldn't return the next-slowest runner if they were on the green team or another non-red team.

Is there a general term for this kind of query?

There could be many teams and many runners so I'd like to know how to make this kind of lookup as efficient as possible.

Use LAG here, along with ROW_NUMBER with partition:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY team ORDER BY timer) ranking,
              LEAD(person) OVER (ORDER BY timer) next_runner
    FROM race
)

SELECT person, ranking, next_runner
FROM cte
WHERE team = 'red'
ORDER BY ranking;

下面演示链接的屏幕截图

Demo

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM