I'm trying to get the last item in an array after I split a string. In JavaScript I'd do this easily with url.split('//')[url.split('//').length-1]
But how do I do this in SQL running on AWS Athena (which I believe is actually Presto)?
-- imagine a url like 'http://www.google.com'
SELECT SPLIT(url, '//')[2]
FROM table
This would result in www.google.com
But in some instances there is no second element, so I need to use [1] and not [2].
-- imagine a url like 'www.google.com'
SELECT SPLIT(url, '//')[2]
FROM table
This would result in an error.
How do I get the last item in the array?
You can use a non-capturing group in a regular expression:
select regexp_extract('www.google.com', '(?:https?://)?(.*)',1)
and
select regexp_extract('http://www.google.com', '(?:https?://)?(.*)',1)
will both return www.google.com
Please note that the protocol can be either HTTP or HTTPS, so the regex above treats the s as optional (https?).
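To apply the same pattern to a column instead of a literal, something like the sketch below should work. The inline VALUES list and the names t, url and without_protocol are made up for illustration; swap in your own table and column.

```sql
-- Hypothetical inline data standing in for a real table with a url column.
SELECT
  url,
  -- Capture group 1 is everything after the optional http:// or https:// prefix.
  regexp_extract(url, '(?:https?://)?(.*)', 1) AS without_protocol
FROM (
  VALUES
    'http://www.google.com',
    'https://www.google.com',
    'www.google.com'
) AS t(url)
```

All three rows should come back as www.google.com. Note that (.*) keeps anything after the host too, so a URL with a path would return the host plus the path.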
You have two options.
The first is the element_at function. Both queries give the desired output:
SELECT ELEMENT_AT(SPLIT('http://www.google.com', '//'), -1)
SELECT ELEMENT_AT(SPLIT('www.google.com', '//'), -1)
The second option is the slice function, which returns a subset of the array. Both queries give the desired output:
SELECT SLICE(SPLIT('http://www.google.com', '//'), -1, 1)[1]
SELECT SLICE(SPLIT('www.google.com', '//'), -1, 1)[1]
-1 is the start index of the sub-array (a negative index counts from the end of the array)
1 is the length of the sub-array
To read more about how the slice function works: https://trino.io/docs/current/functions/array.html#slice
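As a rough sketch of how element_at behaves on a column with mixed values: per the current Trino docs, element_at returns NULL when the index is past the end of the array, whereas the [] subscript fails, which is the error described in the question. Older Athena engine versions may behave differently, so treat that as an assumption to verify. The VALUES list and the names t, url, last_part and second_part are hypothetical.

```sql
-- Hypothetical inline data standing in for a real table with a url column.
SELECT
  url,
  -- Negative index counts from the end, so -1 is always the last element.
  element_at(split(url, '//'), -1) AS last_part,
  -- Assumed to be NULL (not an error) when the array has only one element.
  element_at(split(url, '//'), 2) AS second_part
FROM (
  VALUES
    'http://www.google.com',
    'www.google.com'
) AS t(url)
```

The last_part column is the one that answers the question; second_part is only there to contrast with the [2] subscript.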