Finding part of string and extracting data between delimiter using BigQuery SQL
I have a column like this:
String_to_Extract |
---|
A~S1_B~S2_C~S11 |
A~S1_B~S3_C~S12 |
C~S13_A~S11_B~S4 |
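The part before "~" should be the column name, and the part after "~" should be the row value. The pairs are separated by "_". So the result should look like this:
String_to_Extract | A | B | C |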
---|---|---|---|
A~S1_B~S2_C~S11 | S1 | S2 | S11 |
A~S1_B~S3_C~S12 | S1 | S3 | S12 |
C~S13_A~S11_B~S4 | S11 | S4 | S13 |
This is my approach:
SELECT
String_to_Extract,
SUBSTRING(String_to_Extract, INSTR(String_to_Extract, "A~")+2, ?) AS A,
SUBSTRING(String_to_Extract, INSTR(String_to_Extract, "B~")+2, ?) AS B,
SUBSTRING(String_to_Extract, INSTR(String_to_Extract, "C~")+2, ?) AS C
From Table
How do I get the part between the ~ and the next _ for each column?
Any help is appreciated!
One approach uses REGEXP_EXTRACT:
SELECT
REGEXP_EXTRACT(String_to_Extract, r"(?:^|_)A~([^_]+)") AS A,
REGEXP_EXTRACT(String_to_Extract, r"(?:^|_)B~([^_]+)") AS B,
REGEXP_EXTRACT(String_to_Extract, r"(?:^|_)C~([^_]+)") AS C
FROM yourTable;
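To try the pattern without a real table, the three sample rows from the question can be inlined. This is only a sketch: the `sample` CTE is a stand-in I made up for your table, and the only thing it assumes is that values never contain "_". The `(?:^|_)` prefix makes sure a key is matched only at the start of the string or right after a delimiter, and `([^_]+)` captures everything up to the next "_".

```sql
-- Hypothetical inline data standing in for the real table.
WITH sample AS (
  SELECT 'A~S1_B~S2_C~S11' AS String_to_Extract UNION ALL
  SELECT 'A~S1_B~S3_C~S12' UNION ALL
  SELECT 'C~S13_A~S11_B~S4'
)
SELECT
  String_to_Extract,
  -- REGEXP_EXTRACT returns the single capture group, i.e. the value after "~".
  REGEXP_EXTRACT(String_to_Extract, r"(?:^|_)A~([^_]+)") AS A,
  REGEXP_EXTRACT(String_to_Extract, r"(?:^|_)B~([^_]+)") AS B,
  REGEXP_EXTRACT(String_to_Extract, r"(?:^|_)C~([^_]+)") AS C
FROM sample;
```

This should reproduce the expected table from the question, including the third row where the keys appear in a different order. A key that is missing from a row yields NULL for that column.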
You can also use this approach, which first sorts the split items and then picks the values:
select
split(ordered[safe_offset(0)], '~')[safe_offset(1)] as A,
split(ordered[safe_offset(1)], '~')[safe_offset(1)] as B,
split(ordered[safe_offset(2)], '~')[safe_offset(1)] as C
from (
select
array(select _ from unnest(split(String_to_Extract, '_') ) as _ order by 1) as ordered
from dataset.table
)
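Note that the sorted-array query picks values by position, so it relies on every row containing exactly the keys A, B and C; if a key is missing, the later values shift into the wrong columns. A possible variant, sketched here under the same assumptions as the question's data, looks each value up by its key name instead. The `sample` CTE and the alias `pair` are hypothetical names used only for illustration.

```sql
-- Hypothetical inline data standing in for the real table.
WITH sample AS (
  SELECT 'A~S1_B~S2_C~S11' AS String_to_Extract UNION ALL
  SELECT 'A~S1_B~S3_C~S12' UNION ALL
  SELECT 'C~S13_A~S11_B~S4'
)
SELECT
  String_to_Extract,
  -- Split on "_" into key~value pairs, then keep the value whose key matches.
  (SELECT SPLIT(pair, '~')[SAFE_OFFSET(1)]
   FROM UNNEST(SPLIT(String_to_Extract, '_')) AS pair
   WHERE SPLIT(pair, '~')[SAFE_OFFSET(0)] = 'A'
   LIMIT 1) AS A,
  (SELECT SPLIT(pair, '~')[SAFE_OFFSET(1)]
   FROM UNNEST(SPLIT(String_to_Extract, '_')) AS pair
   WHERE SPLIT(pair, '~')[SAFE_OFFSET(0)] = 'B'
   LIMIT 1) AS B,
  (SELECT SPLIT(pair, '~')[SAFE_OFFSET(1)]
   FROM UNNEST(SPLIT(String_to_Extract, '_')) AS pair
   WHERE SPLIT(pair, '~')[SAFE_OFFSET(0)] = 'C'
   LIMIT 1) AS C
FROM sample;
```

The `LIMIT 1` keeps each scalar subquery from failing if a key happens to appear twice in a row; a missing key simply produces NULL.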