简体   繁体   中英

accessing values in an ARRAY STRUCT in Google Bigquery

I am new to querying in Google BigQuery and am attempting to flatten an ARRAY field in my query so that the array values are listed as a single result in a comma separated list. In my query "associations.associatedvids" is an array field in the deals table. My issue is really is a 2 step problem as I also need to match the associatedvids with the corresponding first and last name fields in another table called contacts. First, for the contact ids, when I do the following

Select
CAST(property_hs_object_id.value AS String) AS deal_ID,
associations.associatedvids AS associated_contacts_ID
From hubspot_data.deals

I get a result like this:

Row    deal_ID         associated_contacts_ID.value 
1      1814103617      3240001
                       3239951
...

but what I want is:

Row    deal_ID         associated_contacts_ID.value 
1      1814103617      3240001,3239951
...

I've tried different ways of unnesting the array, but cannot seem to get it right. For instance the following attempt returns the error "Scalar subquery produced more than one element".

Select
CAST(property_hs_object_id.value AS String) AS deal_ID,
(select associations.associatedvids from unnest(associations.associatedvids)) AS associated_contacts_ID
From hubspot_data.deals

Second, what I ultimately want is:

Row    deal_ID         associated_contact_names 
1      1814103617      John Doe,Jane Doe
...

The names fields are property_firstname.value and property_lastname.value, and associations.associatedvids (data type ARRAY<STRUCT>)=contacts.vids (data type INT64). I've tired the following, but since the data types are different I'm getting an error.

Select
CAST(property_hs_object_id.value AS String) AS deal_ID,
(select concat(property_firstname.value, " ", property_lastname.value)
 from hubspot_data.contacts
 where contacts.vid=associations.associatedvids) AS contact_name
From hubspot_data.deals

Any guidance would be much appreciated!

EDIT: Here's is my attempt at a minimal working example piece of code. I believe the field I'm trying to query is an ARRAY of STURCTs with the data type of the Struct element I want being INT64.

WITH deals AS (
  Select "012345" as deal_ID,
    [STRUCT(["abc"] as company_ID, [123,678,810] as contact_ID)]
      AS associations)
SELECT 
  deal_ID,
  contacts
FROM deals d
CROSS JOIN UNNEST(d.associations) as contacts

this give me:

Row    deal_ID    contacts.company_ID    contacts.contact_ID    
1      012345     abc                    123
                                         678
                                         810

but what I want is

Row    deal_ID    contacts.contact_ID   
1      012345     123, 678, 810

And ultimately, I need to replace the contact_IDs with the contact first and last names that are in a different table (but fortunately not in an array).

Below is for BigQuery Standard SQL

Based on limited info in your question - I guess you are missing STRING_AGG in the second query you presented in your question

It should be

SELECT
  CAST(property_hs_object_id.value AS String) AS deal_ID,
  (SELECT STRING_AGG(associations.associatedvids) FROM UNNEST(associations.associatedvids)) AS associated_contacts_ID
FROM hubspot_data.deals   

Update: answer on updated question

#standardSQL
SELECT 
  deal_ID,
  ARRAY(
    SELECT AS STRUCT 
      company_ID, 
      ( SELECT STRING_AGG(CAST(id AS STRING), ', ') 
        FROM t.contact_ID id
      ) AS contact_ID 
    FROM d.associations t
  ) AS contacts
FROM deals d

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM