Split JSON into columns in a dynamic way in BigQuery
I have the following JSON:
{
  "rewards": {
    "reward_1": {
      "type": "type 1",
      "amount": "amount 1"
    },
    "reward_2": {
      "type": "type 2",
      "amount": "amount 2"
    },
    "reward_3": {
      "type": "type 3",
      "amount": "amount 3"
    },
    "reward_4": {
      "type": "type 4",
      "amount": "amount 4"
    }
  }
}
This JSON is dynamic, and I don't necessarily know how many rewards it will contain; here it's 4, but it could be 2 or 8, etc.
I want to write a query in BigQuery that parses those values dynamically, without knowing how many of them exist, and then splits them into columns.
Hope these are helpful.
1. First, find out how many rewards the JSON contains. (I used a regular expression and the max_reward UDF.)
2. Then, extract each reward from the JSON rewards field in an iterative way.
3. Lastly, use a PIVOT query to turn the result into wide format.
If you want a more generic solution, you need to use BigQuery dynamic SQL to generate the PIVOT columns. I've hard-coded them in the query: ('reward_1', 'reward_2', 'reward_3', 'reward_4').
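To make the counting step concrete, here is a minimal sketch of the regular expression and the max_reward UDF used on their own. The JSON literal and the names j and reward_count are made up for illustration and are not part of the original data.

```sql
-- Same helper as in the answer: the largest number in an array of digit strings.
CREATE TEMP FUNCTION max_reward(arr ARRAY<STRING>) AS ((
  SELECT MAX(CAST(v AS INT64)) FROM UNNEST(arr) v
));

-- REGEXP_EXTRACT_ALL returns only the captured digits of every "reward_N" key,
-- e.g. ['1', '2'] for the illustrative JSON below, so max_reward yields 2.
SELECT max_reward(REGEXP_EXTRACT_ALL(j, r'"reward_([0-9]+)"')) AS reward_count
FROM UNNEST([
  '{"rewards": {"reward_1": {"type": "a", "amount": "1"}, "reward_2": {"type": "b", "amount": "2"}}}'
]) AS j;
```

Note that this gives the highest reward number rather than a true count, so a JSON with gaps in the numbering (say reward_1 and reward_3 only) would still be iterated from 1 to 3.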
CREATE TEMP TABLE sample AS
SELECT 1 AS id, '{"rewards": { "reward_1": { ... ' AS json -- put your json here
UNION ALL
SELECT 2 AS id, '{"rewards": { "reward_1": { ... ' AS json -- put another json here
;
CREATE TEMP FUNCTION extract_reward(json STRING, seq INT64)
RETURNS STRUCT<type STRING, amount STRING>
LANGUAGE js AS """
return JSON.parse(json)['reward_' + seq];
""";
CREATE TEMP FUNCTION max_reward(arr ARRAY<STRING>) AS ((
SELECT MAX(CAST(v AS INT64)) FROM UNNEST(arr) v
));
SELECT * FROM (
SELECT id,
'reward_' || seq AS reward,
extract_reward(FORMAT('%t', JSON_QUERY(json, '$.rewards')), seq) AS value
FROM sample, UNNEST(GENERATE_ARRAY(1, max_reward(REGEXP_EXTRACT_ALL(json, r'"reward_([0-9]+)"')))) seq
) PIVOT (ANY_VALUE(value) FOR reward IN ('reward_1', 'reward_2', 'reward_3', 'reward_4'));
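As a rough illustration of the dynamic SQL variant mentioned earlier, the PIVOT column list can be built from the data and passed to EXECUTE IMMEDIATE. This is an untested sketch, not part of the original answer. It assumes the temp table sample and the temp functions extract_reward and max_reward defined above exist in the same script, and that sample has at least one row.

```sql
-- Sketch: assumes `sample`, `extract_reward` and `max_reward` from above
-- are already defined earlier in the same script.
EXECUTE IMMEDIATE (
  SELECT """
    SELECT * FROM (
      SELECT id,
             'reward_' || seq AS reward,
             extract_reward(FORMAT('%t', JSON_QUERY(json, '$.rewards')), seq) AS value
      FROM sample,
           UNNEST(GENERATE_ARRAY(1, max_reward(REGEXP_EXTRACT_ALL(json, r'"reward_([0-9]+)"')))) seq
    ) PIVOT (ANY_VALUE(value) FOR reward IN ("""
    -- Build 'reward_1', 'reward_2', ... up to the largest number seen in any row.
    || STRING_AGG(FORMAT("'reward_%d'", seq), ', ' ORDER BY seq)
    || "))"
  FROM UNNEST(GENERATE_ARRAY(1, (
    SELECT MAX(max_reward(REGEXP_EXTRACT_ALL(json, r'"reward_([0-9]+)"')))
    FROM sample
  ))) AS seq
);
```

The inner query is the same as the hard-coded version; only the IN (...) list is generated, so rows with fewer rewards simply get NULL in the extra columns.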
To split the reward STRUCT column into separate columns:
SELECT * FROM (
SELECT id,
'reward_' || seq || '_' || IF (offset = 0, 'type', 'amount') AS reward,
value
FROM sample,
UNNEST(GENERATE_ARRAY(1, max_reward(REGEXP_EXTRACT_ALL(json, r'"reward_([0-9]+)"')))) seq,
UNNEST([extract_reward(FORMAT('%t', JSON_QUERY(json, '$.rewards')), seq)]) pair,
UNNEST([pair.type, pair.amount]) value WITH OFFSET
) PIVOT (ANY_VALUE(value) FOR reward IN ('reward_1_type', 'reward_2_type', 'reward_3_type', 'reward_4_type', 'reward_1_amount', 'reward_2_amount', 'reward_3_amount', 'reward_4_amount'));