
Google BigQuery SQL: Extract data from JSON (list and array) into columns

I have a table with JSON strings:

UserID  json_string
100     [{"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}]
100     [{"id": 77379514, "value": "38.658", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}]
100     [{"id": 77379515, "value": "40.569", "os_type": null, "amount": "150", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}]
200     [{"id": 77378899, "value": "25.365", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}]
200     [{"id": 77378900, "value": "35.898", "os_type": null, "amount": "500", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}]
200     [{"id": 77378901, "value": "41.258", "os_type": null, "amount": "400", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}]

In the end, I need to convert the strings into columns:

UserID  ID        value    os_type  amount  created_at                     updated_at                     Type_name
100     77379513  35.4566  null     200     2020-08-16T14:48:27.611-04:00  2020-08-16T14:48:27.611-04:00  same
100     77379514  38.658   null     100     2020-08-16T14:48:27.611-04:00  2020-08-16T14:48:27.611-04:01  niko
100     77379515  40.569   null     150     2020-08-16T14:48:27.611-04:00  2020-08-16T14:48:27.611-04:02  koko
200     77378899  25.365   null     100     2020-09-16T14:48:27.611-04:01  2020-08-17T14:48:27.611-04:03  same
200     77378900  35.898   null     500     2020-09-16T14:48:27.611-04:02  2020-08-17T14:48:27.611-04:04  niko
200     77378901  41.258   null     400     2020-09-16T14:48:27.611-04:03  2020-08-17T14:48:27.611-04:05  koko

First, I tried to extract the JSON from the list:

SELECT UserID, json_extract_array(json_string) as json_array
FROM `project.dataset.table` 

Then I got a table like this:

UserID  json_array
100     {"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}
100     {"id": 77379514, "value": "38.658", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}
100     {"id": 77379515, "value": "40.569", "os_type": null, "amount": "150", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}
200     {"id": 77378899, "value": "25.365", "os_type": null, "amount": "100", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "same'}
200     {"id": 77378900, "value": "35.898", "os_type": null, "amount": "500", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "niko'}
200     {"id": 77378901, "value": "41.258", "os_type": null, "amount": "400", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "koko'}

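From this step on, I tried to use the JSON_EXTRACT_SCALAR function, but I got an error saying that this function does not work on arrays. So what is the correct way to extract the data into columns?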

The query below should work for you:

select UserID, 
  json_extract_scalar(json, '$.id') as id,
  json_extract_scalar(json, '$.value') as value,
  json_extract_scalar(json, '$.os_type') as os_type,
  json_extract_scalar(json, '$.amount') as amount,
  json_extract_scalar(json, '$.created_at') as created_at,
  json_extract_scalar(json, '$.updated_at') as updated_at,
  json_extract_scalar(json, '$.Type_name') as Type_name
from `project.dataset.table`,
unnest(json_extract_array(json_string, '$')) json       

Applied to the sample data from your question:

with `project.dataset.table` as (
  select 100 UserID, '[{"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same"}]' json_string union all
  select 100, '[{"id": 77379514, "value": "38.658", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko"}]' union all
  select 100, '[{"id": 77379515, "value": "40.569", "os_type": null, "amount": "150", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko"}]' union all
  select 200, '[{"id": 77378899, "value": "25.365", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same"}]' union all
  select 200, '[{"id": 77378900, "value": "35.898", "os_type": null, "amount": "500", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko"}]' union all
  select 200, '[{"id": 77378901, "value": "41.258", "os_type": null, "amount": "400", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko"}]' 
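)
select UserID,
  json_extract_scalar(json, '$.id') as id, json_extract_scalar(json, '$.value') as value,
  json_extract_scalar(json, '$.os_type') as os_type, json_extract_scalar(json, '$.amount') as amount,
  json_extract_scalar(json, '$.created_at') as created_at,
  json_extract_scalar(json, '$.updated_at') as updated_at,
  json_extract_scalar(json, '$.Type_name') as Type_name
from `project.dataset.table`,
unnest(json_extract_array(json_string, '$')) json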

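The output has one row per JSON object, with the columns UserID, id, value, os_type, amount, created_at, updated_at and Type_name (the original post showed it as an image).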
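Since json_extract_scalar returns STRING values, every extracted column in the query above is a string. If typed columns are needed, one possible sketch wraps each extraction in SAFE_CAST, which yields NULL instead of an error when a value cannot be converted. The CTE name `sample`, the alias `item`, and the target types (INT64, NUMERIC, TIMESTAMP) are assumptions for illustration and are not part of the original answer; adjust them to your actual schema.

```sql
-- Hypothetical one-row sample standing in for `project.dataset.table`
with sample as (
  select 100 as UserID,
    '[{"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same"}]' as json_string
)
select UserID,
  -- json_extract_scalar returns STRING; SAFE_CAST converts it or returns NULL
  safe_cast(json_extract_scalar(item, '$.id') as int64) as id,
  safe_cast(json_extract_scalar(item, '$.value') as numeric) as value,
  json_extract_scalar(item, '$.os_type') as os_type,
  safe_cast(json_extract_scalar(item, '$.amount') as int64) as amount,
  safe_cast(json_extract_scalar(item, '$.created_at') as timestamp) as created_at,
  safe_cast(json_extract_scalar(item, '$.updated_at') as timestamp) as updated_at,
  json_extract_scalar(item, '$.Type_name') as Type_name
from sample,
-- one output row per element of the JSON array
unnest(json_extract_array(json_string, '$')) item
```

Note that casting to TIMESTAMP normalizes the value to UTC, so the original -04:00 offset is not kept; leave those columns as strings if the offset matters.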

Note: in a few places you used ' instead of ", so this has been "fixed" in the sample data used above.

If you have no control over the values in this table and cannot fix ' to ", you can use the query below instead:

select UserID, 
  json_extract_scalar(json, '$.id') as id,
  json_extract_scalar(json, '$.value') as value,
  json_extract_scalar(json, '$.os_type') as os_type,
  json_extract_scalar(json, '$.amount') as amount,
  json_extract_scalar(json, '$.created_at') as created_at,
  json_extract_scalar(json, '$.updated_at') as updated_at,
  json_extract_scalar(json, '$.Type_name') as Type_name
from `project.dataset.table`,
unnest(json_extract_array(replace(json_string, "'", '"'), '$')) json 

Note the change inside unnest: replace() turns every ' into " before the array is extracted, which handles that issue.
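One caveat: a plain replace() also rewrites apostrophes that legitimately occur inside values. If that is a concern, a narrower sketch could fix only a stray ' that sits directly before a closing brace, which is the pattern seen in the question's data. The CTE name `sample`, the alias `item`, and the regular expression are assumptions for illustration; check them against your real data before relying on them.

```sql
-- Hypothetical one-row sample that keeps the stray ' from the question
with sample as (
  select 100 as UserID,
    '''[{"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}]''' as json_string
)
select UserID,
  json_extract_scalar(item, '$.id') as id,
  json_extract_scalar(item, '$.Type_name') as Type_name
from sample,
-- replace only a ' that is followed (after optional whitespace) by }
unnest(json_extract_array(regexp_replace(json_string, r"'\s*\}", '"}'), '$')) item
```

This leaves apostrophes elsewhere in the string untouched, at the cost of assuming the malformed quote always appears at the end of an object.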
