
Extracting URLs from a string in Oracle

I have a free-form text field in the application that holds URLs. There is no fixed format, so I am struggling to extract the URLs from the text. There can be multiple URLs embedded in the text. How do I extract each one individually? Thanks!

Example

<p><span style="font-size:14px"><span style="font-family:calibri"> Final Rule: <u><span style="color:blue"><a href="https://www.google.com/rules/final/2011/33-10345.pdf"><span style="color:blue">Disclosure</span></a></span></u></span></span></p>

<p><span style="font-size:14px"><span style="font-family:calibri">Press Release: <span style="color:blue"><a href="https://www.co.gov/news/press-release"><span style="color:blue">Hedging Policies</span></a></span></span></span></p>

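Maybe something like this? REGEXP_SUBSTR returns the first capture group (the text between the quotes after a href=) for the match numbered by LEVEL, and CONNECT BY generates one row per match, as counted by REGEXP_COUNT.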

select regexp_substr(col, 'a href="(.*?)"', 1, level, null, 1) as url
  from t
connect by level <= regexp_count(col, 'a href="(.*?)"')
       and prior id = id                 -- add these two lines to extract from multiple
       and prior sys_guid() is not null; -- rows in a table where id is the primary key
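
To try the query without creating a table, here is a minimal self-contained sketch. The inline data is hypothetical: it is a shortened version of the example HTML from the question, and the names t, id and col are simply the ones assumed by the query above.

```sql
-- Hypothetical sample row: two anchors in one text value, as in the question.
with t as (
  select 1 as id,
         '<p>Final Rule: <a href="https://www.google.com/rules/final/2011/33-10345.pdf">Disclosure</a></p>'
         || '<p>Press Release: <a href="https://www.co.gov/news/press-release">Hedging Policies</a></p>' as col
    from dual
)
select id,
       -- 6th argument = 1: return only capture group 1, i.e. the URL itself
       regexp_substr(col, 'a href="(.*?)"', 1, level, null, 1) as url
  from t
connect by level <= regexp_count(col, 'a href="(.*?)"')  -- one row per match
       and prior id = id                                 -- stay within the same source row
       and prior sys_guid() is not null                  -- avoids the CONNECT BY loop error
 order by id, level;
```

With this sample row the query should return two rows, one per href value. Note that the pattern only matches double-quoted href attributes written exactly as a href="..."; and because CONNECT BY always returns the root row, a source row with no links would still come back once with a NULL url, which a WHERE clause on REGEXP_COUNT(...) > 0 is one way to filter out.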


