Can I select several tables in the same WITH query?

I have a long query with a WITH structure. At the end of it, I'd like to output two tables. Is this possible?

(The tables and queries are in Snowflake SQL, by the way.)

The code looks like this:

with table_a as (
               select id, 
                      product_a
               from x.x ),
     table_b as (
               select id, 
                      product_b
               from x.y ),
     table_c as ( 

..... many more alias tables and subqueries here .....

             )

select * from table_g where z = 3 ;

But for the very last statement, I'd like to query table_g twice, once with z = 3 and once with another condition, so that I get two tables as the result. Is there a way of doing that (ending with two queries rather than just one), or do I have to re-run the whole code for each table I want as output?

One query = one result set. That's just the way RDBMSs work.

A CTE (WITH statement) is just syntactic sugar for a subquery.

For instance, a query similar to yours:

with table_a as (
               select id, 
                      product_a
               from x.x ),
     table_b as (
               select id, 
                      product_b
               from x.y ),
     table_c as (     
               select id, 
                      product_c
               from x.z )

select * 
from table_a
   inner join table_b on table_a.id = table_b.id
   inner join table_c on table_b.id = table_c.id;

is 100% identical to:

select *
from
  (select id, product_a from x.x) table_a
  inner join (select id, product_b from x.y) table_b
      on table_a.id = table_b.id
  inner join (select id, product_c from x.z) table_c
      on table_b.id = table_c.id

The CTE version doesn't give you any extra features that aren't available in the non-CTE version (with the exception of a recursive CTE), and the execution path will be 100% the same. (EDIT: Please see Simon's answer and comment below, where he notes that Snowflake may materialize the derived table defined by the CTE so that it only has to perform that step once should the CTE be referenced multiple times in the main query.) As such, there is still no way to get a second result set from the single query.
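Since a single query cannot return two result sets, one common workaround is to run the long WITH chain once, store its final result in a temporary table, and then filter that table as many times as needed. The following is only a sketch under assumptions: the name table_g_tmp is hypothetical, the body of table_g stands in for the elided chain of CTEs in the question, the column z is assumed to come from x.y, and z = 4 is a placeholder for "another condition".

```sql
-- Run the expensive WITH chain once and keep its result for the session.
-- table_g_tmp is a hypothetical name; the table_g body is a stand-in.
CREATE OR REPLACE TEMPORARY TABLE table_g_tmp AS
WITH table_a AS (
    SELECT id, product_a
    FROM x.x
),
table_b AS (
    SELECT id, product_b, z   -- assumption: z is a column of x.y
    FROM x.y
),
table_g AS (
    SELECT table_a.id, table_a.product_a, table_b.product_b, table_b.z
    FROM table_a
    INNER JOIN table_b ON table_a.id = table_b.id
)
SELECT * FROM table_g;

-- Two separate queries, hence two result sets, without re-running the chain.
SELECT * FROM table_g_tmp WHERE z = 3;
SELECT * FROM table_g_tmp WHERE z = 4;   -- placeholder for the other condition
```

A temporary table is dropped when the session ends, so nothing needs to be cleaned up afterwards. If one combined result is acceptable instead of two, both filters could also be applied in a single query with an extra column that says which condition each row matched.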

While the two forms are logically equivalent, they don't necessarily get the same execution plan.

The first case is when one of the stages in the CTE is expensive and is reused many times, either by other CTEs or through joins. Under Snowflake, when such a stage is written as a CTE, I have witnessed it running the "expensive" part only a single time, which can be good. For example:

WITH expensive_select AS (
    -- complex_filters, filters_1, filters_2 and stuff are placeholders;
    -- the join conditions are left out here as well
    SELECT a.a, b.b, c.c
    FROM table_a AS a
    JOIN table_b AS b
    JOIN table_c AS c
    WHERE complex_filters
), do_some_thing_with_results AS (
    SELECT a, stuff  -- a is selected so the final join can use it
    FROM expensive_select
    WHERE filters_1
), do_some_agregation AS (
    SELECT a, SUM(b) as sum_b
    FROM expensive_select
    WHERE filters_2
    GROUP BY a  -- required because SUM(b) is computed per value of a
)
SELECT a.a
    ,a.b
    ,b.stuff
    ,c.sum_b
FROM expensive_select AS a
LEFT JOIN do_some_thing_with_results AS b ON a.a = b.a
LEFT JOIN do_some_agregation AS c ON a.a = c.a;
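One way to see how Snowflake treats a CTE that is referenced more than once is to look at the plan. The sketch below uses Snowflake's EXPLAIN command on a small query that reads the same CTE twice; the table x.x is borrowed from the question as a stand-in, and the alias n is made up for the illustration. EXPLAIN only shows the compiled plan without running the query, so the query profile of an actual run is still the better evidence of what was executed.

```sql
-- Show the plan for a query that references the same CTE twice.
-- x.x is a stand-in table taken from the question.
EXPLAIN USING TEXT
WITH expensive_select AS (
    SELECT id, product_a
    FROM x.x
)
SELECT a.id, a.product_a, b.n
FROM expensive_select AS a
LEFT JOIN (
    -- second reference to the same CTE
    SELECT id, COUNT(*) AS n
    FROM expensive_select
    GROUP BY id
) AS b ON a.id = b.id;
```

If the scan of the underlying table shows up once in the plan and is shared by both branches, the CTE is being reused; if it shows up once per reference, it is being recomputed.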

This was originally unrolled, and the expensive part was some views: the date range filter applied at the top level was not getting pushed down into them (due to window functions), which resulted in full table scans, multiple times. After pushing the views into the CTE, the cost was paid once. (In our case, putting the date range filters in the CTE made Snowflake notice the filters and push them down into the view. Things can change, though: a few weeks later the original code ran as well as the modified version, so they "fixed" something.)
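The filter placement described above might look roughly like the following. All names here (my_schema.events_view, event_date, user_id, amount) are hypothetical and only serve the illustration; whether this placement actually changes the plan depends on the optimizer, which, as noted, can change over time.

```sql
-- Hypothetical view and columns, for illustration only.
WITH filtered_events AS (
    SELECT user_id, event_date, amount
    FROM my_schema.events_view
    -- The date range is applied inside the CTE, directly against the view,
    -- instead of only in the outermost query.
    WHERE event_date >= '2024-01-01'
      AND event_date <  '2024-02-01'
)
SELECT user_id, SUM(amount) AS total_amount
FROM filtered_events
GROUP BY user_id;
```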

In other cases like this, the different paths that used the CTE each consumed a smaller subset of its results, so using the CTE reduced the remote I/O and improved performance, although there were then more stalls in the execution plan.

I also use CTEs like this to make the code easier to read: I give the CTE a meaningful name, then alias it to something short where it is used. Really love that.
