How can I split a column with nested delimiters into a list of STRUCTs in BigQuery SQL?
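I have a column with entries like this (a variable number of two-tuples):
D000001:Term1;D00007:Term19;D00008:Term781
(the mesh_terms column below), and I would like to split them so that I end up with an ARRAY<STRUCT<code STRING, term STRING>> for each row.
The query below works as desired, but I am curious whether anyone has suggestions for improving readability, performance (this runs on BigQuery, so it is not too much of an issue), or adherence to best practices.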
WITH t1 AS (
  SELECT
    pmid,
    SPLIT(mesh_terms, ';') AS l1
  FROM `omicidx_etl.pm1`
),
t2 AS (
  SELECT
    t1.pmid,
    x
  FROM t1,
    UNNEST(t1.l1) AS x
),
t3 AS (
  SELECT
    pmid,
    SPLIT(x, ':') AS y
  FROM t2
)
SELECT
  pmid,
  ARRAY_AGG(STRUCT(t3.y[OFFSET(0)] AS code, t3.y[OFFSET(1)] AS term)) AS mesh_terms
FROM
  t3
GROUP BY pmid
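
For comparison, here is a hedged sketch of how the three CTEs might be folded into a single correlated ARRAY subquery. It has not been run against the actual table, and it rests on a few assumptions: pmid is unique per row (there is no GROUP BY, so each input row yields exactly one output row), and no term contains a ':' of its own. The aliases pair and pos are names introduced here for illustration only.

```sql
SELECT
  pmid,
  ARRAY(
    SELECT AS STRUCT
      -- SAFE_OFFSET returns NULL instead of raising an error
      -- when a pair has no ':' separator.
      SPLIT(pair, ':')[SAFE_OFFSET(0)] AS code,
      SPLIT(pair, ':')[SAFE_OFFSET(1)] AS term
    FROM UNNEST(SPLIT(mesh_terms, ';')) AS pair WITH OFFSET AS pos
    -- Keep the pairs in their original order within the string.
    ORDER BY pos
  ) AS mesh_terms
FROM `omicidx_etl.pm1`
```

Two behavioral differences from the query above are worth checking before swapping it in: rows whose mesh_terms is NULL should come back with an empty array here rather than being dropped by the join against UNNEST, and duplicate pmid values would no longer be merged into one array.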