Documentation says:
regexp_extract(string, pattern)
→ varchar
Returns the first substring matched by the regular expression pattern in string https://prestodb.io/docs/current/functions/regexp.html
I have the following query
select regexp_extract('sssshttps://jira.domain.com/browse/PR-6835hhhh',
'/(https.*\/browse\/)(\w+\-\d+)/g')
This return null
when it should return https://jira.domain.com/browse/PR-6835
The regexp seems OK it works in many regexp tools that I checked. Why I can't extract the substring of the url link?
Note that forward slash and dash are not regex metacharacters, and so do not need to be escaped, at least not in Presto regex. Consider this version:
select regexp_extract('sssshttps://jira.domain.com/browse/PR-6835hhhh',
'https.*?/browse/\w+-\d+')
This should return:
https://jira.domain.com/browse/PR-6835
Your current regex appears to be coming from some other language, such as JavaScript or PHP, where the regex literal has a delimiter /
, therefore requiring that /
needs to be escaped.
I just edited my answer to use lazy dot .*?
when matching content between https
and the first /browse
. This should fix the edge case mentioned in your comment where .*
is matching across multiple URLs.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.