简体   繁体   中英

How to Group By all columns of table A when joining two tables to create one big nested table on BigQuery?

I'm trying to create a big nested table that would be composed of many tables, such as my Clients table, Phone Numbers, Emails... They all have in common the client_id field. For the moment I have to following query that works well ("join" the Clients table fields and the according Phone Numbers fields):

SELECT Clients.*, ARRAY_AGG( STRUCT(Timestamp, Country_Code, Local_Number, Phone_Number, Whatsapp)) as Phones
FROM Clients LEFT JOIN Phones USING(client_id)
GROUP BY Client.client_id, Clients.Timestamp, Clients.First_Name, Clients.Last_Name, Clients.DOB

Client.client_id, Clients.Timestamp, Clients.First_Name, Clients.Last_Name, Clients.DOB are all my fields in Clients table. I would like to use this query as subquery to "join" it to the Emails table in a similar way (using with and renaming the result of the subquery). The Thing is that I would like to GROUP BY all the fields of Clients table without writing them all every time. Neither GROUP BY Clients.* nor GROUP BY ALL work...

What can I do to shorten this?

If client_id is unique, then you can just aggregate by that. What you want is to get all the columns when you do that. A very BigQuery'ish approach is:

SELECT ANY_VALUE(c).*, 
       ARRAY_AGG( STRUCT(p.Timestamp, p.Country_Code, p.Local_Number, p.Phone_Number, p.Whatsapp)) as Phones
FROM Clients c LEFT JOIN
     Phones p
     USING (client_id)
GROUP BY c.client_id;

This works fine when I run it:

WITH clients as (
      select 'x' as name, 1 as client_id union all 
      select 'y' as name, 2 as client_id
    ),
    phones as (
        select current_timestamp as timestamp, 1 as client_id, 'abc' as country_code, 111 as local_number, 222 as phone_number, null as whatsapp
    )
SELECT ANY_VALUE(c).*, 
       ARRAY_AGG( STRUCT(p.Timestamp, p.Country_Code, p.Local_Number, p.Phone_Number, p.Whatsapp)) as Phones
FROM Clients c LEFT JOIN
     Phones p
     USING (client_id)
GROUP BY c.client_id;

Try below

SELECT ANY_VALUE(c).*, 
  ARRAY_AGG((SELECT AS STRUCT p.* EXCEPT(client_id))) as Phones
FROM Clients c 
LEFT JOIN Phones p
USING (client_id)
GROUP BY c.client_id

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM