Explanation of a WITH RECURSIVE query in Postgres

I have been reading about WITH queries in Postgres, and this is the one that puzzles me:

WITH RECURSIVE t(n) AS (
    VALUES (1)
  UNION ALL
    SELECT n+1 FROM t WHERE n < 100
)
SELECT sum(n) FROM t;

I can't understand how the evaluation of this query works.

  • t(n) looks like a function with a parameter. How is the value of n passed?

Any insight into how the recursive statement is broken down in SQL would be appreciated.

This is called a common table expression (CTE), and the RECURSIVE form is a way of expressing a recursive query in SQL.

t(n) defines the name of the CTE as t, with a single column named n. It's similar to an alias for a derived table:

select ... 
from (
  ...
) as t(n);

The recursion starts with the value 1 (that's the values (1) part) and then repeatedly adds one to it until 100 is reached: the last row that still satisfies n < 100 is 99, and it produces 99 + 1 = 100. So it generates the numbers from 1 to 100. The final query then sums up all those numbers, which gives 5050.
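To check which numbers the query really produces, it can be run and its result summarised. The sketch below is only an illustration and makes one loud assumption: it uses SQLite through Python's standard sqlite3 module instead of Postgres, simply because that needs no server. SQLite accepts this particular statement unchanged, but it is a different engine, so treat it as a stand-in and not as proof of Postgres behaviour.

```python
import sqlite3

# Same CTE as in the question; only the final SELECT is extended
# so that the range and the row count are visible as well.
QUERY = """
WITH RECURSIVE t(n) AS (
    VALUES (1)
  UNION ALL
    SELECT n + 1 FROM t WHERE n < 100
)
SELECT min(n), max(n), count(*), sum(n) FROM t
"""

# In-memory SQLite database (assumption: stand-in for Postgres).
conn = sqlite3.connect(":memory:")
lowest, highest, row_count, total = conn.execute(QUERY).fetchone()
conn.close()

print(lowest, highest, row_count, total)
```

The row with n = 99 still passes the n < 100 filter and contributes 99 + 1, so the largest value in t is 100 and the sum is 5050.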

n is a column name, not a "variable" and the "assignment" happens in the same way as any data retrieval.

WITH RECURSIVE t(n) AS (
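    VALUES (1) -- non-recursive term: evaluated once, seeds the result
  UNION ALL
    SELECT n+1 FROM t WHERE n < 100 -- recursive term: here t refers to the rows produced by the previous step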
)
SELECT sum(n) FROM t;

If you "unroll" the recursion (which in fact is an iteration) then you'd wind up with something like this:

select x.n + 1
from (
  select x.n + 1
  from (
    select x.n + 1
    from (
      select x.n + 1
      from (
         values (1)
      ) as x(n) 
    ) as x(n)
  ) as x(n)
) as x(n)

More details in the manual:
https://www.postgresql.org/docs/current/static/queries-with.html

If you are looking for how it is evaluated, the recursion occurs in two phases.

  1. The root is executed once.
  2. The recursive part is executed repeatedly, each time against only the rows produced by the previous execution, until it returns no rows. The documentation is a little vague on that point.
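The two phases can be mimicked in ordinary code. This is a rough sketch that follows the "working table" description in the PostgreSQL manual; it is not the actual executor, and the names evaluate_recursive_cte, recursive_step and step are made up for the illustration.

```python
def evaluate_recursive_cte(anchor_rows, recursive_step):
    """Imitate the evaluation of WITH RECURSIVE ... UNION ALL.

    anchor_rows    -- rows returned by the non-recursive term
    recursive_step -- function applied to the rows of the previous step
    """
    result = list(anchor_rows)    # everything the CTE will contain
    working = list(anchor_rows)   # rows visible as "t" in the next step
    iterations = 0
    while working:                # stop once a step yields no rows
        intermediate = recursive_step(working)
        result.extend(intermediate)
        working = intermediate    # only the newest rows are seen next time
        iterations += 1
    return result, iterations


def step(working):
    # SELECT n + 1 FROM t WHERE n < 100
    return [n + 1 for n in working if n < 100]


rows, iterations = evaluate_recursive_cte([1], step)  # VALUES (1)
print(rows[0], rows[-1], len(rows), sum(rows), iterations)
```

Each pass sees only the rows from the pass before it, which is why the whole thing behaves like a loop and not like a call stack. The last pass receives the single row 100, which fails n < 100, so it yields nothing and the loop ends.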

Now, normally in databases, we think of "function" in a different way than we think of them when we do imperative programming. In database terms, the best way to think of a function is "a correspondence where for every domain value you have exactly one corresponding value." So one of the immediate challenges is to stop thinking in terms of programming functions. Even user-defined functions are best thought about in this other way since it avoids a lot of potential nastiness regarding the intersection of running the query and the query planner... So it may look like a function but that is not correct.

Instead, the WITH clause uses a different, almost inverse notation. Here you have the set name t, followed (optionally in this case) by the tuple structure (n). So this is not a function with a parameter, but a relation with a structure.
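One way to convince yourself that the parenthesised part is just a column list is to rename everything and look at the result set. In the sketch below the names numbers and value are arbitrary replacements for t and n, and SQLite (via Python's sqlite3 module) is again assumed as a server-less stand-in for Postgres.

```python
import sqlite3

# Same query as before, but the relation is called "numbers"
# and its single column is called "value".
QUERY = """
WITH RECURSIVE numbers(value) AS (
    VALUES (1)
  UNION ALL
    SELECT value + 1 FROM numbers WHERE value < 100
)
SELECT * FROM numbers
"""

conn = sqlite3.connect(":memory:")
cur = conn.execute(QUERY)
column_names = [col[0] for col in cur.description]  # names of result columns
values = [row[0] for row in cur.fetchall()]
conn.close()

print(column_names, len(values), sum(values))
```

The result has one column called value and the same 100 rows as before. Nothing was "passed in"; the name in parentheses only labels the column of the relation.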

So this is how it breaks down conceptually (only the first three steps are shown):

SELECT 1 AS n
UNION ALL
SELECT n + 1 FROM (SELECT 1 AS n) AS s WHERE n < 100
UNION ALL
SELECT n + 1 FROM (SELECT n + 1 AS n FROM (SELECT 1 AS n) AS s WHERE n < 100) AS s WHERE n < 100

Of course that is a simplification, because internally the database keeps track of the CTE state and each step only reads the rows of the last iteration, so in practice this gets folded back to near-linear complexity (while the nested query above would suggest much worse performance than that).

So in reality you get something more like:

SELECT 1 AS n
UNION ALL
SELECT 1 + 1 AS n WHERE 1 < 100
UNION ALL
SELECT 2 + 1 AS n WHERE 2 < 100
...

In essence the previous values carry over.

