In our database we have tables with company addresses. Some companies have more than one address. In that case we used to store all the addresses in a single column, like this:
'119021, г Москва, ул Льва Толстого, д 16; 117312, г Москва, ул Вавилова, д 19'
So the addresses are separated by a semicolon and a space.
Now I have to create a different table where every address has its own row, by copying the data from the existing table.
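What tools should I use in my procedure to determine whether a column holds one address or several, and to separate the addresses when there are several?
Should I use REGEXP? If so, how?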
To determine whether a column holds one address or several, you could use REGEXP_COUNT to count how many times the semicolon character appears in the field; adding 1 to that gives the number of addresses:
SELECT t.*, regexp_count(t.address, ';') + 1 FROM mytable t;
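As a small illustration of the counting idea, the sketch below lists only the rows that hold several addresses. The inline sample data and the name column are assumptions made so that the query can run on its own; they stand in for the real mytable.

```sql
-- Hypothetical sample data standing in for mytable
WITH mytable (name, address) AS (
  SELECT 'A', 'addr 1; addr 2' FROM dual UNION ALL
  SELECT 'B', 'addr 3'         FROM dual
)
SELECT t.name,
       regexp_count(t.address, ';') + 1 AS address_count  -- semicolons + 1
FROM mytable t
WHERE instr(t.address, ';') > 0;  -- keep only rows that contain a separator
```

With this sample data only company A is returned, with an address_count of 2.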
To separate the addresses when there is more than one, split the string into new rows. As commented by O.Jones, one approach is to use regex functions in combination with a hierarchical (CONNECT BY) query, like:
SELECT TRIM(regexp_substr( t.address, '[^;]+', 1, LEVEL )) AS address
FROM mytable t
CONNECT BY LEVEL <= regexp_count( t.address, ';' ) + 1;
This type of statement can be embedded into an INSERT ... SELECT or CREATE TABLE ... AS SELECT statement, for example:
CREATE TABLE newtable as
SELECT t.name, TRIM(regexp_substr( t.address, '[^;]+', 1, LEVEL )) AS address
FROM mytable t
CONNECT BY LEVEL <= regexp_count( t.address, ';' ) + 1;
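One caveat with CONNECT BY LEVEL: when the source table holds more than one row, each level connects every row to every other row, so the result contains duplicated addresses. A common remedy is to tie each level to its own source row. The sketch below assumes mytable has a unique key column called id, which is a hypothetical name; the inline sample data is there only to make the query runnable by itself.

```sql
-- Hypothetical sample data; id is assumed to be a unique key of mytable
WITH mytable (id, name, address) AS (
  SELECT 1, 'A', 'addr 1; addr 2'         FROM dual UNION ALL
  SELECT 2, 'B', 'addr 3; addr 4; addr 5' FROM dual
)
SELECT t.name,
       TRIM(regexp_substr(t.address, '[^;]+', 1, LEVEL)) AS address
FROM mytable t
CONNECT BY LEVEL <= regexp_count(t.address, ';') + 1
       AND PRIOR t.id = t.id             -- stay within the same source row
       AND PRIOR sys_guid() IS NOT NULL; -- avoids the "CONNECT BY loop" error
```

With this sample data the query returns five rows, one per address, with no duplicates.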
Firstly, I would be remiss if I did not point out for future searchers that storing more than one piece of information in a single column violates the first normal form of database design and will give you nothing but trouble down the road. Since you are remedying this, I salute you!
Secondly, using a regex pattern of the form '[^;]+' to parse delimited strings fails when there are null elements in the list. See here for more info on why that is a bad idea: https://stackoverflow.com/a/31464699/2543416
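To make the failure concrete, here is a minimal sketch that asks for the second element of a list whose second element is empty. The literal 'a;;c' is made-up test data, and the second pattern is a single-semicolon simplification of the one used in the example that follows.

```sql
-- The list 'a;;c' has three elements: 'a', an empty one, and 'c'
SELECT regexp_substr('a;;c', '[^;]+', 1, 2)               AS naive_second,
       regexp_substr('a;;c', '(.*?)(;|$)', 1, 2, NULL, 1) AS safe_second
FROM dual;
```

naive_second comes back as 'c', because '[^;]+' skips the empty element entirely, while safe_second is NULL, which is the actual second element.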
Now, I'm not all doom and gloom! I don't know what character set you are using, but you'll get the idea. This example uses a regex that allows for NULL list elements and treats one or more semicolons followed by a space, or the end of the line, as the delimiter.
with tbl(id, str) as (
select 1, '119021-1, г Москва, ул Льва Толстого, д 16; 117312-1, г Москва, ул Вавилова, д 19' from dual union all
select 2, '119021-2, г Москва, ул Льва Толстого, д 16;;; 117312-2, г Москва, ул Вавилова, д 19' from dual
)
select id, regexp_substr(str, '(.*?)(;+ |$)', 1, level, NULL, 1) addr_fixed
from tbl
connect by regexp_substr(str, '(.*?)(;+ |$)', 1, level, NULL, 1) is not null
and prior id = id
and prior sys_guid() is not null;
ID ADDR_FIXED
---------- ---------------------------------------------
1 119021-1, ¿ ¿¿¿¿¿¿, ¿¿ ¿¿¿¿ ¿¿¿¿¿¿¿¿, ¿ 16
1 117312-1, ¿ ¿¿¿¿¿¿, ¿¿ ¿¿¿¿¿¿¿¿, ¿ 19
2 119021-2, ¿ ¿¿¿¿¿¿, ¿¿ ¿¿¿¿ ¿¿¿¿¿¿¿¿, ¿ 16
2 117312-2, ¿ ¿¿¿¿¿¿, ¿¿ ¿¿¿¿¿¿¿¿, ¿ 19
4 rows selected.