I have a log table with a column 'path' holding values like root/home/desktop/parent/child/grandchild
I want to group by a prefix of this column, based on an integer input 'n', where n is the number of slashes up to which the substring should be extracted. For example, with n = 1 I would want to group by 'root/', and with n = 3 I would want to group by 'root/home/desktop/'. How can I achieve this in BigQuery? Can I use a regex for this, or is there a better way? I would appreciate a bit of explanation of whatever approach is suggested. Thanks!!
Not sure if the example below really requires any extra explanation:
select *,
  split(path, '/')[safe_offset(0)],
  split(path, '/')[safe_offset(1)],
  split(path, '/')[safe_offset(2)],
  split(path, '/')[safe_offset(3)],
  split(path, '/')[safe_offset(4)],
  split(path, '/')[safe_offset(5)]
from your_table
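As a brief explanation of the query above: split(path, '/') returns an array of path segments, and [safe_offset(i)] reads the zero-based i-th element, yielding NULL rather than an error when the path has fewer segments. A minimal sketch, using a made-up two-segment path and illustrative column aliases (neither is from the original post):

```sql
-- Hypothetical one-row input; 'root/home' has only two segments.
with your_table as (
  select 'root/home' as path
)
select
  path,
  split(path, '/')[safe_offset(0)] as segment_0,  -- first segment
  split(path, '/')[safe_offset(1)] as segment_1,  -- second segment
  split(path, '/')[safe_offset(2)] as segment_2   -- out of range: NULL instead of an error
from your_table
```

With plain offset(2) in place of safe_offset(2), the same query would fail with an out-of-bounds error for this row.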
I would like to have the splits combined in the form of a string until the last slash...
To get a partial path from the beginning, use the example below:
create temp function get_path(path string, n int64) as ((
  -- Split the path into segments, keep those at positions 0 .. n-1,
  -- and join them back together with '/' in their original order.
  select string_agg(part, '/' order by offset)
  from unnest(split(path, '/')) part with offset
  where offset < n
));
select
  get_path(path, 1) n1,
  get_path(path, 2) n2,
  get_path(path, 3) n3,
  get_path(path, 4) n4,
  get_path(path, 5) n5,
  get_path(path, 6) n6
from your_table
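To tie this back to the original question, the extracted prefix can then be used as the grouping key. The following is only a sketch: the sample rows, the prefixed CTE, and the row_count alias are made up for illustration, and the depth is hard-coded to 3 (a query parameter such as @n could presumably be passed instead).

```sql
create temp function get_path(path string, n int64) as ((
  select string_agg(part, '/' order by offset)
  from unnest(split(path, '/')) part with offset
  where offset < n
));

-- Hypothetical sample rows standing in for the log table.
with your_table as (
  select 'root/home/desktop/parent/child/grandchild' as path union all
  select 'root/home/desktop/parent/other' union all
  select 'root/var/log'
),
prefixed as (
  -- Compute the prefix once per row so the outer query groups on a plain column.
  select get_path(path, 3) as path_prefix
  from your_table
)
select
  path_prefix,
  count(*) as row_count
from prefixed
group by path_prefix
order by path_prefix
```

Computing the prefix in a separate CTE keeps the GROUP BY simple; note that this variant returns the prefix without a trailing slash.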
If you want to use a regexp instead, consider the below:
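create temp function get_path(path string, n int64) as ((
  -- Match the first n segments from the start of the path; each segment is
  -- one or more non-slash characters followed by an optional slash.
  -- n is cast to string because || concatenates strings, not int64.
  regexp_extract(path, r'(^(?:[^/]+/?){' || cast(n as string) || '})')
));
with your_table as (
  select 'root/home/desktop/parent/child/grandchild' path
)
select
  get_path(path, 1) n1,
  get_path(path, 2) n2,
  get_path(path, 3) n3,
  get_path(path, 4) n4,
  get_path(path, 5) n5,
  get_path(path, 6) n6
from your_table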
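One detail matters when choosing between the two variants for grouping: the string_agg version joins the segments and so returns no trailing slash, whereas the regexp version keeps the slash after each matched segment (which is closer to the 'root/home/desktop/' form asked for). A sketch that puts both side by side; the function names get_path_split and get_path_regex are illustrative only, and the cast of n to string is added on the assumption that || does not accept an int64 operand.

```sql
-- Variant 1: split into segments, keep the first n, and re-join with '/'.
create temp function get_path_split(path string, n int64) as ((
  select string_agg(part, '/' order by offset)
  from unnest(split(path, '/')) part with offset
  where offset < n
));

-- Variant 2: match the first n "segment plus optional slash" groups.
create temp function get_path_regex(path string, n int64) as ((
  regexp_extract(path, r'(^(?:[^/]+/?){' || cast(n as string) || '})')
));

with your_table as (
  select 'root/home/desktop/parent/child/grandchild' as path
)
select
  get_path_split(path, 3) as split_prefix,  -- no trailing slash
  get_path_regex(path, 3) as regex_prefix   -- trailing slash kept
from your_table
```

Either result can serve as a grouping key, as long as the same variant is used consistently across the query.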