
How to convert a JSON string column into a queryable table

I have exported a whole collection from Firestore to BigQuery so that I can run certain queries on it.

Now that the data has been populated in my BigQuery console, I can query the whole set like this:

SELECT * 
FROM `myapp-1a602.firestore_orders.orders_raw_changelog` 
LIMIT 1000

This statement returns several columns, but the one I'm looking for is the data column. It holds each document as a JSON string, and I need to query the values inside it.

This is the data from one row:

{
    "cart": [{
        "qty": 1,
        "description": "Sprite 1 L",
        "productName": "Sprite 1 Liter",
        "price": 1.99,
        "productId": 9
    }],
    "storeName": "My awesome shop",
    "status": 5,
    "timestamp": {
        "_seconds": 1590713204,
        "_nanoseconds": 916000000
    }
}

This data is inside the data column, so if I do this

SELECT data 
FROM `myapp-1a602.firestore_orders.orders_raw_changelog` 
LIMIT 1000

I get the JSON value for each document, but I don't know how to query those values. Let's say I want to find all orders with status 5 and shopName My awesome shop. Do I need to do something with this JSON to convert it into a table, or do I need to run the query against the JSON itself?

How can I query this json output?

Thanks

You can work with the JSON functions, as in this MySQL example:

CREATE TABLE products (id INTEGER, attribs_json JSON);
INSERT INTO products VALUES (1, '{ "cart": [{ "qty": 1, "description": "Sprite 1 L", "productName": "Sprite 1 Liter", "price": 1.99, "productId": 9 }], "storeName": "My awesome shop", "status": 5, "timestamp": { "_seconds": 1590713204, "_nanoseconds": 916000000 } }');
SELECT * FROM products WHERE attribs_json->"$.status" = 5 AND attribs_json->"$.storeName" = 'My awesome shop';

id | attribs_json
 1 | {"cart": [{"qty": 1, "price": 1.99, "productId": 9, "description": "Sprite 1 L", "productName": "Sprite 1 Liter"}], "status": 5, "storeName": "My awesome shop", "timestamp": {"_seconds": 1590713204, "_nanoseconds": 916000000}}


SELECT attribs_json->"$.storeName", attribs_json->"$.status", attribs_json->"$.cart[0].qty" FROM products WHERE attribs_json->"$.status" = 5 AND attribs_json->"$.storeName" = 'My awesome shop';

attribs_json->"$.storeName" | attribs_json->"$.status" | attribs_json->"$.cart[0].qty"
"My awesome shop"           | 5                        | 1


There is also JSON_EXTRACT, available in MySQL 5.7 and above.

Since the JSON is ultimately just text, you could also use REGEXP or RLIKE.

To turn the JSON back into rows, you can use JSON_TABLE.
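As a rough sketch of the JSON_TABLE approach, assuming MySQL 8.0 and the products table from the example above: each element of the cart array becomes one row. The column names and types in the COLUMNS clause are illustrative choices, not something the original answer specifies.

```sql
-- Sketch only: expand every element of $.cart into its own row (MySQL 8.0+).
-- Column names/types below are assumptions chosen to match the sample JSON.
SELECT p.id,
       c.qty,
       c.productName,
       c.price,
       c.productId
FROM products AS p,
     JSON_TABLE(
       p.attribs_json,
       '$.cart[*]' COLUMNS (
         qty         INT           PATH '$.qty',
         productName VARCHAR(100)  PATH '$.productName',
         price       DECIMAL(10,2) PATH '$.price',
         productId   INT           PATH '$.productId'
       )
     ) AS c
WHERE p.attribs_json->'$.status' = 5;
```

The comma before JSON_TABLE acts as a lateral join, so the path expression can refer to the attribs_json column of the current products row.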

Do I need to do something with this JSON to convert it into a table? Do I need to run the query against the JSON itself?

The following is for BigQuery Standard SQL:

#standardSQL
SELECT * EXCEPT(data, cart_item), 
  JSON_EXTRACT(data, '$.status') AS status, 
  JSON_EXTRACT(data, '$.storeName') AS storeName,
  JSON_EXTRACT(cart_item, '$.qty') AS qty,
  JSON_EXTRACT(cart_item, '$.description') AS description,
  JSON_EXTRACT(cart_item, '$.productName') AS productName,
  JSON_EXTRACT(cart_item, '$.price') AS price,
  JSON_EXTRACT(cart_item, '$.productId') AS productId
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(data, '$.cart')) cart_item   

You can apply this to the sample data from your question, as in the example below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 order_id, '''
{
    "cart": [{
        "qty": 1,
        "description": "Sprite 1 L",
        "productName": "Sprite 1 Liter",
        "price": 1.99,
        "productId": 9
    },{
        "qty": 2,
        "description": "Fanta 1 L",
        "productName": "Fanta 1 Liter",
        "price": 1.99,
        "productId": 10
    }],
    "storeName": "My awesome shop",
    "status": 5,
    "timestamp": {
        "_seconds": 1590713204,
        "_nanoseconds": 916000000
    }
}  
'''  data 
)
SELECT * EXCEPT(data, cart_item), 
  JSON_EXTRACT(data, '$.status') AS status, 
  JSON_EXTRACT(data, '$.storeName') AS storeName,
  JSON_EXTRACT(cart_item, '$.qty') AS qty,
  JSON_EXTRACT(cart_item, '$.description') AS description,
  JSON_EXTRACT(cart_item, '$.productName') AS productName,
  JSON_EXTRACT(cart_item, '$.price') AS price,
  JSON_EXTRACT(cart_item, '$.productId') AS productId
FROM `project.dataset.table`,
UNNEST(JSON_EXTRACT_ARRAY(data, '$.cart')) cart_item   

The result is:

Row order_id    status  storeName           qty     description     productName         price   productId    
1   1           5       "My awesome shop"   1       "Sprite 1 L"    "Sprite 1 Liter"    1.99    9    
2   1           5       "My awesome shop"   2       "Fanta 1 L"     "Fanta 1 Liter"     1.99    10   
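The string columns in this result keep their double quotes because JSON_EXTRACT returns JSON-formatted text. A possible variant for the filter asked about in the question (status 5 and a given store name) uses JSON_EXTRACT_SCALAR, which returns the unquoted value as a STRING. This is a sketch that assumes the same placeholder `project.dataset.table` and a `data` column holding the JSON string; the ordered_at alias is made up for illustration.

```sql
#standardSQL
-- Sketch only: filter on scalar fields inside the JSON string column `data`.
SELECT
  JSON_EXTRACT_SCALAR(data, '$.storeName') AS storeName,
  -- JSON_EXTRACT_SCALAR always returns STRING, so cast numeric fields explicitly
  CAST(JSON_EXTRACT_SCALAR(data, '$.status') AS INT64) AS status,
  -- Firestore timestamps are exported as {_seconds, _nanoseconds};
  -- this converts the seconds part to a TIMESTAMP (hypothetical alias)
  TIMESTAMP_SECONDS(
    CAST(JSON_EXTRACT_SCALAR(data, '$.timestamp._seconds') AS INT64)
  ) AS ordered_at
FROM `project.dataset.table`
WHERE JSON_EXTRACT_SCALAR(data, '$.storeName') = 'My awesome shop'
  AND CAST(JSON_EXTRACT_SCALAR(data, '$.status') AS INT64) = 5
```

Because no UNNEST is involved, this returns one row per order rather than one row per cart item.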

What you need to do is extract the values from the JSON data, for example: SELECT ... WHERE data->'$.storeName' = "My awesome shop" AND data->'$.status' = 5

Extracting the 'cart' or the 'timestamp' key gives you a JSON array or object, respectively, which needs further extraction to get at the data. I hope this helps. You will probably want to have a look at the MySQL documentation ( https://dev.mysql.com/doc/refman/8.0/en/json.html ) or https://www.mysqltutorial.org/mysql-json/ .
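To illustrate the "further extraction" step, here is a minimal MySQL sketch that reaches into the nested keys with longer paths. The table name orders is hypothetical (the answer does not name one); the data column and the JSON shape are taken from the question.

```sql
-- Sketch only: `orders` is a hypothetical table with a JSON column `data`.
SELECT
  data->>'$.storeName'            AS storeName,    -- ->> unquotes the string
  data->'$.status'                AS status,
  data->>'$.cart[0].productName'  AS first_product, -- first element of the array
  -- nested object: convert the epoch seconds to a DATETIME
  FROM_UNIXTIME(CAST(data->'$.timestamp._seconds' AS UNSIGNED)) AS ordered_at
FROM orders
WHERE data->>'$.storeName' = 'My awesome shop'
  AND data->'$.status' = 5;
```

The -> and ->> operators are shorthand for JSON_EXTRACT and JSON_UNQUOTE(JSON_EXTRACT(...)), so the same paths work with the function forms.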

You can use UNNEST in the FROM clause to access the cart's fields, and JSON_EXTRACT functions in the WHERE clause to filter for the rows you want. Take care over whether you are accessing the JSON root or the cart array: these are json_data and cart_items, respectively, in the example below. (By the way, in your example shopName doesn't exist, but storeName does.)

WITH
  `myapp-1a602.firestore_orders.orders_raw_changelog` AS (
  SELECT
    '{"cart": [{"qty": 1,"description": "Sprite 1 L","productName": "Sprite 1 Liter","price": 1.99,"productId": 9}, {"qty": 11,"description": "Sprite 11 L","productName": "Sprite 11 Liter","price": 11.99,"productId": 19}],"storeName": "My awesome shop","status": 5,"timestamp": {"_seconds": 1590713204,"_nanoseconds": 916000000}}' json_data )
SELECT
  JSON_EXTRACT(json_data, '$.status') AS status,
  JSON_EXTRACT(json_data, '$.storeName') AS storeName,
  JSON_EXTRACT(cart_items, '$.productName') AS product,
  JSON_EXTRACT_SCALAR(cart_items, '$.qty') AS qty
FROM
  `myapp-1a602.firestore_orders.orders_raw_changelog`,
  UNNEST(JSON_EXTRACT_ARRAY(json_data, '$.cart')) AS cart_items
WHERE
  JSON_EXTRACT(json_data,'$.storeName') like "\"My awesome shop\"" AND 
  CAST(JSON_EXTRACT_SCALAR(json_data,'$.status') AS NUMERIC) = 5
