How to flatten a column of large JSON strings with different numbers of keys into a table in BigQuery
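I have a Google BigQuery table with one column that contains large JSON strings. Each row has a different number of keys and nested keys, and I would like to flatten them into columns.
My table looks like this: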
id | payload |
---|---|
1 | {"key1":{"value":"1"},"key2":2,"key3":1,"key4":"abcde","version":10} |
2 | {"key1":{"value":"2"},"key2":5,"key3":2,"key4":"defg","version":11} |
I have managed to extract single columns using the BigQuery functions JSON_EXTRACT and/or JSON_EXTRACT_SCALAR:
SELECT id, JSON_EXTRACT(payload, '$.key1') as key1
FROM `project.dataset.table`
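and so on, but I do not want to write out the more than 100 keys that are nested in the JSON column. There must be a better way!
I would appreciate any kind of support!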
Consider the approach below:
-- Return the top-level keys of a JSON object string.
create temp function extract_keys(input string) returns array<string> language js as """
  return Object.keys(JSON.parse(input));
""";

-- Return the top-level values, in the same order as extract_keys.
create temp function extract_values(input string) returns array<string> language js as """
  return Object.values(JSON.parse(input));
""";

-- Flatten nested objects into one single-level JSON object with dotted keys, e.g. {"key1.value": "1"}.
create temp function extract_all_leaves(input string) returns string language js as '''
  function flattenObj(obj, parent = '', res = {}){
    for(let key in obj){
      let propName = parent ? parent + '.' + key : key;
      if(typeof obj[key] == 'object'){
        flattenObj(obj[key], propName, res);
      } else {
        res[propName] = obj[key];
      }
    }
    return JSON.stringify(res);
  }
  return flattenObj(JSON.parse(input));
''';

-- One row per (id, key, value); offset keeps each key's position within its payload.
create temp table temp_table as (
  select offset, key, value, id
  from your_table t,
  unnest([struct(extract_all_leaves(payload) as leaves)]),
  unnest(extract_keys(leaves)) key with offset
  join unnest(extract_values(leaves)) value with offset
  using(offset)
);

-- Build the pivot column list from the keys actually present, then run the pivot dynamically.
execute immediate (select '''
  select * from (select * except(offset) from temp_table)
  pivot (any_value(value) for replace(key, '.', '__') in (''' || keys_list || '''
  ))'''
  from (select string_agg('"' || replace(key, '.', '__') || '"', ',' order by offset) keys_list from (
    select key, min(offset) as offset from temp_table group by key
  ))
);
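If applied to the sample data in your question, the output has one row per id, with the columns key1__value, key2, key3, key4 and version.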
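To see what the flattening step produces before the pivot, the UDF logic can be tried outside BigQuery. The following is only a sketch: it assumes Node.js, reuses the recursion from extract_all_leaves but returns a plain object instead of a JSON string, and uses the first sample row from the question with the closing quote after "abcde" restored. The names payload, leaves and columns are illustrative and not part of the answer.

```javascript
// Same recursion as extract_all_leaves, returning the object itself for inspection.
function flattenObj(obj, parent = '', res = {}) {
  for (const key in obj) {
    const propName = parent ? parent + '.' + key : key;
    if (typeof obj[key] === 'object') {
      // Nested object (or array): recurse, extending the dotted path.
      flattenObj(obj[key], propName, res);
    } else {
      // Leaf value: store it under its full dotted path.
      res[propName] = obj[key];
    }
  }
  return res;
}

// First sample row from the question (closing quote after "abcde" restored).
const payload = '{"key1":{"value":"1"},"key2":2,"key3":1,"key4":"abcde","version":10}';
const leaves = flattenObj(JSON.parse(payload));

// Mirror the SQL replace(key, '.', '__'), which replaces every dot.
const columns = Object.keys(leaves).map((k) => k.split('.').join('__'));

console.log(leaves);
// { 'key1.value': '1', key2: 2, key3: 1, key4: 'abcde', version: 10 }
console.log(columns);
// [ 'key1__value', 'key2', 'key3', 'key4', 'version' ]
```

Because `typeof null` is also `'object'`, a key whose value is null is recursed into and produces no leaf, so it would not appear as a column unless some other row supplies a non-null value for it.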