I am using an Oracle database and I'm trying to select values between parentheses. Here is my table, which has ID and Roads as columns. I have read-only access to this database, so I can only use SELECT:
ID Roads
-- -----
1 #Chaussée de Waterloo (Ixelles)#
2 #Rue Reper-Vreven (Bruxelles)#
3 #Rue des Fraises (Anderlecht)#
4 #Chaussée de Roodebeek (Woluwe-Saint-Lambert)#
5 #Square Jean Absil (Etterbeek)#Avenue Hansen-Soulie (Etterbeek)#Avenue Le Marinel (Etterbeek)#
Basically, from the Roads column, I only want to keep the values between parentheses. As the final query has other tables in it, I want a select distinct. The desired output is:
ID Roads
------------------
1 Ixelles
2 Bruxelles
3 Anderlecht
4 Woluwe-Saint-Lambert
5 Etterbeek, Etterbeek, Etterbeek
I tried the following query, which works fine when there is only one set of parentheses, but this doesn't work when there are several (like for ID 5), as it only gives back the values in the first set of parentheses:
select distinct substr(roads, instr(roads,'(') + 1, instr(roads,')') - instr(roads,'(') - 1) as roads
from table
Does anyone know where I'm going wrong?
One solution - based in part on the one referenced by przemo_pl:
SELECT SUBSTR( with_parentheses,2,length(with_parentheses)-2) between_parenthesis
FROM
  (select REGEXP_SUBSTR(dat, '\([^()]*\)+',1,level) AS with_parentheses
   from (select '#Square Jean Absil (Etterbeek)#Avenue Hansen-Soulie (Anderlecht)#Avenue Le Marinel (Ixelles)#' as dat from dual )
   connect by LEVEL <= ( LENGTH(dat) - LENGTH(REPLACE(dat, '(', '')))
  )
returns:
between_parenthesis
---------------------
"Etterbeek"
"Anderlecht"
"Ixelles"
If you want this re-assembled into a single row then that adds another wrinkle.
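As a rough sketch of that re-assembly for the single test string above, LISTAGG can be applied directly to the rows generated by CONNECT BY. This assumes Oracle 11gR2 or later (for LISTAGG, REGEXP_COUNT and the sub-expression argument of REGEXP_SUBSTR); the alias between_parentheses is just an illustrative name, and the pattern here captures the text inside the parentheses so no SUBSTR is needed.

```sql
-- Sketch only: assumes Oracle 11gR2+ and a single input row.
select listagg(
         -- text inside the LEVEL-th pair of parentheses (sub-expression 1)
         regexp_substr(dat, '\(([^()]*)\)', 1, level, null, 1),
         ', '
       ) within group (order by level) as between_parentheses
from (select '#Square Jean Absil (Etterbeek)#Avenue Hansen-Soulie (Anderlecht)#Avenue Le Marinel (Ixelles)#' as dat
      from dual)
connect by level <= regexp_count(dat, '\(');  -- one generated row per opening parenthesis
```

This only works as written because the inner query returns exactly one row; against a real table the CONNECT BY would have to be correlated per row.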
Please see my other post for a much simpler answer. I am leaving this one here because it is still an interesting approach, and a lesson in how over-thinking can make a solution far too complicated. Sometimes you just have to step back and approach the problem a different way. :-)
Ok, you need to loop through the rows and through the parentheses within the rows:
with tbl(ID, Roads) as (
select 1, '#Chaussée de Waterloo (Ixelles)#' from dual
union
select 2, '#Rue Reper-Vreven (Bruxelles)#' from dual
union
select 3, '#Rue des Fraises (Anderlecht)#' from dual
union
select 4, '#Chaussée de Roodebeek (Woluwe-Saint-Lambert)#' from dual
union
select 5, '#Square Jean Absil (Etterbeek)#Avenue Hansen-Soulie (Etterbeek)#Avenue Le Marinel (Etterbeek)#' from dual
)
SELECT ID, Roads,
       COLUMN_VALUE AS match_nbr,
       REGEXP_SUBSTR( Roads ,'\(([^\)]*)\)', 1, COLUMN_VALUE, NULL, 1 ) AS match_value
FROM tbl,
     TABLE(
       CAST(
         MULTISET(
           SELECT LEVEL
           FROM DUAL
           CONNECT BY LEVEL <= REGEXP_COUNT( Roads ,'\(' )
         ) AS SYS.ODCINUMBERLIST
       )
     );
Result:
See here for a similar post, which links to another post which provides more info. I don't claim to understand it fully. :-)
EDIT: Updated to get list of roads on one line using listagg( ):
SQL> with tbl(ID, Roads) as (
select 1, '#Chaussée de Waterloo (Ixelles)#' from dual
union
select 2, '#Rue Reper-Vreven (Bruxelles)#' from dual
union
select 3, '#Rue des Fraises (Anderlecht)#' from dual
union
select 4, '#Chaussée de Roodebeek (Woluwe-Saint-Lambert)#' from dual
union
select 5, '#Square Jean Absil (Etterbeek)#Avenue Hansen-Soulie (Etterbeek)#Avenue Le Marinel (Etterbeek)#' from dual
)
select id,
listagg(match_value, ', ') within group (order by match_nbr) road_list
from (
  SELECT ID, Roads, COLUMN_VALUE AS match_nbr,
         REGEXP_SUBSTR( Roads ,'\(([^\)]*)\)', 1, COLUMN_VALUE, NULL, 1 ) AS match_value
  FROM tbl,
       TABLE(
         CAST(
           MULTISET(
             SELECT LEVEL
             FROM DUAL
             CONNECT BY LEVEL <= REGEXP_COUNT( Roads ,'\(' )
           ) AS SYS.ODCINUMBERLIST
         )
       )
)
group by id
order by id
;
ID ROAD_LIST
---------- --------------------------------------------------
1 Ixelles
2 Bruxelles
3 Anderlecht
4 Woluwe-Saint-Lambert
5 Etterbeek, Etterbeek, Etterbeek
SQL>
I am adding this as a new answer because it is so different from my previous one, which was a classic example of over-complicated thinking getting worse with each iteration (and is still a good example of that, lol). Sometimes you just need to sense when things are getting too complicated and not be afraid to start over on a different tack!
Ok, check this out. I went back to square one and studied the string for a pattern. Whether there is one road or more, each road is surrounded by pound signs. (This design of multiple values in one column violates basic data modeling tenets and should be reworked, but who hasn't had to deal with a crappy design they have no control over?) My thought was to loop through the string using a regular expression, replacing everything from a pound sign through the closing parenthesis with what is inside the parentheses. Granted, this leaves a pound sign at the end, but we'll clean that up later. Note that REGEXP_REPLACE replaces all occurrences of the pattern by default, so it effectively loops through all the roads, and it is WAY easier to maintain than a fugly mass of nested INSTR() and SUBSTR() calls:
SQL> with tbl(ID, Roads) as (
select 1, '#Chaussée de Waterloo (Ixelles)#' from dual
union
select 2, '#Rue Reper-Vreven (Bruxelles)#' from dual
union
select 3, '#Rue des Fraises (Anderlecht)#' from dual
union
select 4, '#Chaussée de Roodebeek (Woluwe-Saint-Lambert)#' from dual
union
select 5, '#Square Jean Absil (Etterbeek)#Avenue Hansen-Soulie (Etterbeek)#Avenue Le Marinel (Etterbeek)#' from dual
)
select ID, rtrim(regexp_replace(Roads, '#.+?\((.+)\)', '\1, '), ', #') Roads
from tbl;
ID ROADS
---------- ----------------------------------------
1 Ixelles
2 Bruxelles
3 Anderlecht
4 Woluwe-Saint-Lambert
5 Etterbeek, Etterbeek, Etterbeek
SQL>
The regular expression pattern explained:
# Look for a literal pound sign
. followed by any character
+ repeated one or more times (one or more of any character)
? makes the preceding + non-greedy, so it matches as few characters as possible before the left paren
\( a literal left paren
( start remembered group 1
. any character
+ one or more "any" characters
) end remembered group 1
\) followed by a literal closing right paren
If the above string is found, replace with the "replace-with" string:
\1 The first remembered group which is what is inside the parentheses.
,<space> followed by a comma and a space
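Then it's a tad quick and dirty, but just use RTRIM to remove the trailing comma, space and pound sign (RTRIM treats ', #' as a set of characters to strip from the right end, not as a literal string). Voilà! Whew.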
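One caveat on the pattern: the capture group (.+) is written as a greedy quantifier, and in a Perl-style regex engine it would run on to the last closing parenthesis in the string. As far as I can tell, the output above depends on how Oracle treats a greedy quantifier that follows a lazy one. A variant that does not rely on that, sketched here as an alternative rather than a correction of the author's query, uses negated character classes so each match is forced to stop at the first parenthesis it meets. The two-row tbl is just a cut-down copy of the sample data.

```sql
-- Sketch only: same REGEXP_REPLACE + RTRIM idea, with negated character classes.
with tbl(ID, Roads) as (
  select 1, '#Chaussée de Waterloo (Ixelles)#' from dual
  union all
  select 5, '#Square Jean Absil (Etterbeek)#Avenue Hansen-Soulie (Etterbeek)#Avenue Le Marinel (Etterbeek)#' from dual
)
select ID,
       rtrim(
         regexp_replace(
           Roads,
           '#[^(]*\(([^)]*)\)',  -- '#', anything up to the first '(', then capture up to the first ')'
           '\1, '                -- keep only the captured text, followed by comma-space
         ),
         ', #'                   -- strip trailing commas, spaces and pound signs
       ) as Roads
from tbl;
```

Because RTRIM strips a set of characters, a value that legitimately ended in a comma, space or pound sign would be trimmed as well; that does not happen with the sample data here.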