
How do I extract a substring up to a specific word?

I am given a list of locations:

location
Smith St
Smith Street 
10 Happy Lily Rd 
52 Happy Lily Road
Abraham St

The following is what I hope to achieve:

location
Smith Street
Happy Lily Road
Abraham Street

How can I do that using a regexp? Thank you!

The question does not give a proper table structure or any other details to work from, so let me assume you want to retrieve the addresses that contain the full words.

SELECT location FROM TABLE1 WHERE location LIKE '%Street%' OR location LIKE '%Road%'

Here's your query. I added a leading space to each alternative in the pattern so that 'notstreetorRoad' is not matched.

with cte(location) AS(
  select 'Smith St' from dual 
  union all
  select 'Smith Street' from dual
  union all
  select '10 Happy Lily Rd ' from dual
  union all
  select '52 Happy Lily Road' from dual
  union all
  select 'notstreetorRoad' from dual
  union all
  select 'good road' from dual
)
select * from cte where  REGEXP_LIKE(lower(location), ' road| street| st| rd')


Because you have a special requirement here, grouping strings that contain some ambiguity (Rd and Road, St and Street, no numbers), you first need to normalize the strings so that they can be compared. Afterwards, you can simply use a group by (or distinct) to eliminate duplicates.

with testtab AS(
  select 'Smith St' as LOCATIONCOLUMN from dual 
  union all
  select 'Smith Street' from dual
  union all
  select '10 Happy Lily Rd ' from dual
  union all
  select '52 Happy Lily Road' from dual
  union all
  select '52 Happy Lily road' from dual
  union all
  select 'notstreetorRoad' from dual
  union all
  select 'notstreetorRd' from dual
  union all
  select 'good road' from dual
)
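select formatedLoc
from (select trim(
               regexp_replace(                  -- 3) remove all digits
                 regexp_replace(                -- 2) trailing Rd/Road -> Road
                   regexp_replace(              -- 1) trailing St/Street -> Street
                     LOCATIONCOLUMN, 'St *$|Street *$', 'Street', 1, 0, 'i'),
                   'Rd *$|Road *$', 'Road', 1, 0, 'i'),
                 '\d', '')
             ) as formatedLoc                   -- 4) trim leftover spaces
       from testtab) formatedTab
group by formatedLoc;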

yields

FORMATEDLOC
-----------------
Happy Lily Road
notstreetorRoad
good road
Smith Street

It replaces a trailing St with Street and a trailing Rd with Road, removes all digits, and finally trims the result. After that, you can easily eliminate duplicates.
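
To make the order of these steps easier to follow, here is a minimal sketch that exposes each stage as its own column instead of nesting the calls. It assumes Oracle and reuses two rows from the test data above; the CTE names step1 and step2 and the column names street_fixed, road_fixed and digits_removed are made up for illustration.

```sql
-- Sketch only: same replacements as above, one stage per column (Oracle syntax).
with testtab as (
  select 'Smith St' as locationcolumn from dual
  union all
  select '10 Happy Lily Rd ' from dual
),
step1 as (
  -- Stage 1: a trailing St or Street (plus optional spaces) becomes Street
  select locationcolumn,
         regexp_replace(locationcolumn, 'St *$|Street *$', 'Street', 1, 0, 'i') as street_fixed
  from testtab
),
step2 as (
  -- Stage 2: a trailing Rd or Road (plus optional spaces) becomes Road
  select locationcolumn,
         street_fixed,
         regexp_replace(street_fixed, 'Rd *$|Road *$', 'Road', 1, 0, 'i') as road_fixed
  from step1
)
select locationcolumn,
       street_fixed,
       road_fixed,
       regexp_replace(road_fixed, '\d', '')       as digits_removed,  -- Stage 3: drop digits
       trim(regexp_replace(road_fixed, '\d', '')) as formatedloc      -- Stage 4: trim spaces
from step2;
```

For the row '10 Happy Lily Rd ', the stages should read '10 Happy Lily Rd ' (no street suffix to change), '10 Happy Lily Road', ' Happy Lily Road', and finally 'Happy Lily Road'.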

As this is very specific, you'll have to add other possible ambiguous strings yourself.
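
As a rough illustration of adding one more pair, the sketch below wraps the expression in a further regexp_replace for Ave and Avenue. The abbreviation pair and the two 'Oak' rows are assumptions made up for this example; they do not appear in the question.

```sql
-- Sketch only: one extra regexp_replace layer per abbreviation pair (Oracle syntax).
with testtab as (
  select 'Smith St' as locationcolumn from dual
  union all
  select '7 Oak Ave' from dual     -- hypothetical row, not in the original data
  union all
  select 'Oak Avenue' from dual    -- hypothetical row, not in the original data
)
select distinct
       trim(
         regexp_replace(
           regexp_replace(
             regexp_replace(
               regexp_replace(locationcolumn, 'St *$|Street *$', 'Street', 1, 0, 'i'),
               'Rd *$|Road *$', 'Road', 1, 0, 'i'),
             'Ave *$|Avenue *$', 'Avenue', 1, 0, 'i'),   -- the added pair
           '\d', '')
       ) as formatedloc
from testtab;
```

Like the original patterns, these are anchored only to the end of the string, not to the start of a word, so a name that merely ends in the same letters is rewritten too. If that matters for your data, one option is to require a preceding space in each pattern.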
