简体   繁体   中英

Unnest a JSON stringified array in BigQuery

I have the following table in Google BigQuery:

+------------+---------+---------+
|    Name    |  City  | items   |
+------------+---------+
| James     | Dallas   |[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]

| John      | Chicago  |[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]
+------------+---------+

I'm trying to achieve something like this:

+------------+---------+---------+----------------------+--------------+
|    Name    |  City   | text     |  line_total_excl_vat | product_id
+------------+---------+
| James     | Dallas   |  pear    |       24             |       100

| John      | Chicago  |  apple   |       29             |       200

| John      | Chicago  |  banana  |       34             |       300
+------------+---------+

The column "items" is actually a string. Is there a way to unnest this data format and achieve the view that I want in BigQuery? Thanks !

Below is for BigQuery Standard SQL

#standardSQL
SELECT Name, City, 
  JSON_EXTRACT_SCALAR(json, '$.text') AS text,
  JSON_EXTRACT_SCALAR(json, '$.line_total_excl_vat') AS line_total_excl_vat,
  JSON_EXTRACT_SCALAR(json, '$.product_id') AS product_id
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(items,'$')) json   

If to apply to sample data from your question - as in below example

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'James' AS Name, 'Dallas' AS City, "[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]" AS items UNION ALL
  SELECT 'John', 'Chicago', "[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]"
)
SELECT Name, City, 
  JSON_EXTRACT_SCALAR(json, '$.text') AS text,
  JSON_EXTRACT_SCALAR(json, '$.line_total_excl_vat') AS line_total_excl_vat,
  JSON_EXTRACT_SCALAR(json, '$.product_id') AS product_id
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(items,'$')) json   

output is

Row Name    City    text    line_total_excl_vat product_id   
1   James   Dallas  pear    24                  100  
2   John    Chicago apple   29                  200  
3   John    Chicago banana  34                  300  

It's a bit of fiddleing with json_extract and json_extract_array in combination with unnest()...

WITH t AS (
  SELECT 'James' as Name, 'Dallas' AS City, "[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]" AS items
  UNION ALL
  SELECT 'John', 'Chicago', "[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]"
)

SELECT 
  # we'll unnest this array in the next statement and grab its elements
  JSON_EXTRACT_ARRAY(items,'$') as arr
  
  # unnest() turns array into table format - jason-function extracts fields from each row
  ,ARRAY(SELECT AS STRUCT
  
      JSON_EXTRACT_SCALAR(i,'$.text') as text,
      JSON_EXTRACT_SCALAR(i,'$.line_total_excl_vat') as line_total_excl_vat,
      JSON_EXTRACT_SCALAR(i,'$.product_id') as product_id
   
   FROM UNNEST(JSON_EXTRACT_ARRAY(items,'$')) as i 
   ) AS unnested_items
   ,* # original fields for reference
FROM t

This creates a nested output that you can use later on (see the JSON representation of the output, it's more clear) - if you want to flatten the table you can lateral join this resulting array...

WITH t AS (
#     Name    |  City  | items   |
  SELECT 'James' as Name, 'Dallas' AS City, "[{'text': 'pear', 'line_total_excl_vat': '24','product_id': 100}]" AS items
  UNION ALL
  SELECT 'John', 'Chicago', "[{'text': 'apple', 'line_total_excl_vat': '29','product_id': 200},{'text': 'banana', 'line_total_excl_vat': '34','product_id': 300}]"
)

SELECT 
   * 
FROM t CROSS JOIN UNNEST(ARRAY((SELECT AS STRUCT
  
      JSON_EXTRACT_SCALAR(i,'$.text') as text,
      JSON_EXTRACT_SCALAR(i,'$.line_total_excl_vat') as line_total_excl_vat,
      JSON_EXTRACT_SCALAR(i,'$.product_id') as product_id
   
   FROM UNNEST(JSON_EXTRACT_ARRAY(items,'$')) as i 
   )))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM