BigQuery: Subquery with UNION as ARRAY
I have the following two example tables.

orders_table:

order_id | linked_order1 | linked_order2 |
---|---|---|
1001 | L005 | null |
1002 | null | null |
1003 | L006 | L007 |

invoices_table:

order_id | linked_order_id | charge |
---|---|---|
1001 | null | 4.27 |
1002 | null | 9.82 |
1003 | null | 7.42 |
null | L005 | 2.12 |
null | L006 | 1.76 |
null | L007 | 3.20 |
I need to join these so that the charges of all orders (linked and otherwise) are shown as part of a single order row. My desired output is something like this.

order_id | linked_order1 | linked_order2 | invoices.charge | invoices.order_id | invoices.linked_order_id |
---|---|---|---|---|---|
1001 | L005 | null | 4.27 | 1001 | null |
 | | | 2.12 | null | L005 |
1002 | null | null | 9.82 | null | null |
1003 | L006 | L007 | 7.42 | null | null |
 | | | 1.76 | null | L006 |
 | | | 3.20 | null | L007 |

I can get the main order's invoice into the result as follows.
SELECT
  orders,
  ARRAY(
    -- ORDER is a reserved keyword; the join column is order_id
    SELECT AS STRUCT i.* FROM `invoices_table` AS i WHERE i.order_id = orders.order_id) AS invoice
FROM
  `orders_table` AS orders
I can run a separate query that unions all of the invoice results into a single table for given order IDs, but I can't combine it with the query above without getting errors.
Something like this:
SELECT
  orders,
  ARRAY(
    SELECT AS STRUCT * FROM
      (SELECT * FROM `invoices_table` AS i WHERE i.order_id = orders.order_id
       UNION ALL SELECT * FROM `invoices_table` AS i WHERE i.linked_order_id = orders.linked_order1
       UNION ALL SELECT * FROM `invoices_table` AS i WHERE i.linked_order_id = orders.linked_order2)
  ) AS invoice
FROM
  `orders_table` AS orders
But this gives me the correlated subqueries error.
This is much simpler than I thought. The following query gives me what I was after.
SELECT
  orders,
  ARRAY(
    SELECT AS STRUCT i.*
    FROM `invoices_table` AS i
    WHERE i.order_id = orders.order_id
       OR i.linked_order_id IN (orders.linked_order1, orders.linked_order2)) AS invoice
FROM
  `orders_table` AS orders
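
To try the query above without creating tables, the sample rows from the question can be inlined as CTEs. This is only a sketch: the CTE names stand in for the real `orders_table` and `invoices_table`, and the column types (INT64 for `order_id`, STRING for the linked IDs) are assumptions, since the question does not state them.

```sql
-- Sample data copied from the question. The CTEs are placeholders for the
-- real tables; the column types are assumed (INT64 order_id, STRING linked IDs).
WITH orders_table AS (
  SELECT 1001 AS order_id, 'L005' AS linked_order1, CAST(NULL AS STRING) AS linked_order2 UNION ALL
  SELECT 1002, NULL, NULL UNION ALL
  SELECT 1003, 'L006', 'L007'
),
invoices_table AS (
  SELECT 1001 AS order_id, CAST(NULL AS STRING) AS linked_order_id, 4.27 AS charge UNION ALL
  SELECT 1002, NULL, 9.82 UNION ALL
  SELECT 1003, NULL, 7.42 UNION ALL
  SELECT NULL, 'L005', 2.12 UNION ALL
  SELECT NULL, 'L006', 1.76 UNION ALL
  SELECT NULL, 'L007', 3.20
)
SELECT
  orders,
  ARRAY(
    -- One struct per invoice that matches the order itself or a linked order.
    SELECT AS STRUCT i.*
    FROM invoices_table AS i
    WHERE i.order_id = orders.order_id
       OR i.linked_order_id IN (orders.linked_order1, orders.linked_order2)
  ) AS invoice
FROM orders_table AS orders
ORDER BY orders.order_id;
```

With this data, order 1001 should carry the charges 4.27 and 2.12, order 1002 only 9.82, and order 1003 the charges 7.42, 1.76 and 3.20. The order of elements inside each array is not guaranteed unless the subquery adds its own ORDER BY.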
Using a cross join:
SELECT o.*, ARRAY_AGG(i) invoices
FROM Orders o, Invoices i
WHERE o.order_id = i.order_id
OR i.linked_order_id IN (o.linked_order1, o.linked_order2)
GROUP BY 1, 2, 3;
A query that uses OR conditions in the WHERE clause can perform poorly on a large dataset. In that case, you may try the query below instead, which generates the same result.
SELECT o.*, ARRAY_AGG(i) invoices FROM (
SELECT o, i FROM Orders o JOIN Invoices i USING (order_id)
UNION ALL
SELECT o, i FROM Orders o JOIN Invoices i ON i.linked_order_id IN (o.linked_order1, o.linked_order2)
) GROUP BY 1, 2, 3;
For the desired output table, a full outer join is the right command.
with tblA as (Select order_id, 1 linked_order1, 2 linked_order2, from unnest([1,2,3]) order_id),
tblB as (Select order_id, 109.99 charge from unnest([3,4,5]) order_id
union all select null order_id, * from unnest([50.1,29.99]) charge
)
Select *
from tblA
full join tblB
using(order_id)
In your setting, several join conditions are needed. Therefore, the first table is used three times, once for each join key.
with tblA as (Select order_id, "L05" linked_order1, "L2" linked_order2, from unnest(["1","2","3"]) order_id),
tblB as (Select order_id, null linked_order_id, 109.99 charge from unnest(["3","4","5"]) order_id
union all select null order_id, "L05" , * from unnest([50.1,29.99]) charge
)
Select A.order_id,linked_order1,linked_order2, array_agg(struct(tblB.order_id,linked_order_id,charge))
from
(
Select * from tblA, unnest([order_id,linked_order1,linked_order2]) as tmp_id
) A
full join tblB
on tmp_id = ifnull(tblB.order_id,linked_order_id)
where charge is not null #or tmp_id=A.order_id
group by 1,2,3