
What is the best way to separate and save addresses from a string in PL/SQL?

In our database we have tables with company addresses. Some companies have more than one address. In that case we used to save the addresses in a single column, like this:

'119021, г Москва, ул Льва Толстого, д 16; 117312, г Москва, ул Вавилова, д 19' 

So the addresses are separated by a semicolon and a space.

Now I have to create a different table in which every address has its own row, by copying the data from the existing table.

What tools should I use in my procedure to

  • determine whether there are one or more than 1 addresses in a column
  • separate addresses if there are more than 1

Should I use REGEXP? If yes, how?

determine whether there are one or more than 1 addresses in a column

You could use REGEXP_COUNT to count how many times the semicolon appears in the field; adding 1 to that gives you the number of addresses:

SELECT t.*, regexp_count(t.address, ';') + 1 FROM mytable t;
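
If you only need to flag the rows that hold several addresses, something along these lines might do. This is a sketch that assumes the same mytable and address column as the query above; the address_count and has_multiple aliases are made up for illustration.

```sql
-- Sketch: flag rows that hold more than one address.
-- Assumes mytable(address) as in the query above; the aliases are illustrative.
SELECT t.address,
       REGEXP_COUNT(t.address, ';') + 1 AS address_count,
       CASE WHEN INSTR(t.address, ';') > 0 THEN 'Y' ELSE 'N' END AS has_multiple
FROM   mytable t;
```

Keep in mind that a trailing or doubled semicolon inflates the count, because every semicolon is counted as a separator.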

separate addresses if there are more than 1

To split the string into new rows, as commented by O.Jones, one approach is to use regex functions in combination with a hierarchical (CONNECT BY) query, like this:

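SELECT TRIM(regexp_substr( t.address, '[^;]+', 1,  LEVEL )) address
FROM mytable t
CONNECT BY LEVEL <= regexp_count( t.address,  ';' ) + 1
  AND PRIOR t.name = t.name          -- keep each row's levels separate (assumes name is unique; use the primary key if there is one)
  AND PRIOR sys_guid() IS NOT NULL;  -- avoids a false CONNECT BY loop error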

This type of statement can be embedded into an INSERT ... SELECT or CREATE TABLE ... AS SELECT statement, for example:

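CREATE TABLE newtable as
SELECT t.name, TRIM(regexp_substr( t.address, '[^;]+', 1,  LEVEL )) address
FROM mytable t
CONNECT BY LEVEL <= regexp_count( t.address,  ';' ) + 1
  AND PRIOR t.name = t.name          -- keep each row's levels separate (assumes name is unique; use the primary key if there is one)
  AND PRIOR sys_guid() IS NOT NULL;  -- avoids a false CONNECT BY loop error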

Demo on Oracle 11gR2
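
If the target table already exists, the same query can feed an INSERT ... SELECT instead. The following is only a sketch: the company_address table and its columns are hypothetical, and it assumes that name uniquely identifies a row in mytable (use the primary key if there is one).

```sql
-- Hypothetical target table; adjust names and sizes to your schema.
CREATE TABLE company_address (
  company_name VARCHAR2(200),
  address      VARCHAR2(1000)
);

-- One output row per address, copied from the existing table.
INSERT INTO company_address (company_name, address)
SELECT t.name,
       TRIM(REGEXP_SUBSTR(t.address, '[^;]+', 1, LEVEL))
FROM   mytable t
CONNECT BY LEVEL <= REGEXP_COUNT(t.address, ';') + 1
       AND PRIOR t.name = t.name          -- stay within the same source row (assumes name is unique)
       AND PRIOR SYS_GUID() IS NOT NULL;  -- non-deterministic term that prevents a false loop error
```

The two PRIOR conditions matter once mytable holds more than one row: without them every level connects to every row, and the result contains duplicates.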

Firstly I would be remiss if I did not point out for future searchers that storing more than one piece of information in a single column violates first normal form of database design and will give you nothing but trouble down the road. Since you are remedying this I salute you!

Secondly, using a regex pattern of the form '[^;]+' to parse delimited strings fails when there are null elements in the list: the empty element is skipped, so every element after it comes back in the wrong position. See here for more info on why that is a bad idea: https://stackoverflow.com/a/31464699/2543416
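
A small illustration of the problem, using a made-up three-element list whose second element is empty. The column aliases are illustrative only, and the second pattern is a simplified variant with a capture group.

```sql
-- The list 'a;;c' has three elements; the second one is empty.
SELECT REGEXP_SUBSTR('a;;c', '[^;]+', 1, 2)               AS skips_empty,
       REGEXP_SUBSTR('a;;c', '(.*?)(;|$)', 1, 2, NULL, 1) AS keeps_empty
FROM   dual;
-- skips_empty is 'c': the third element, returned as if it were the second.
-- keeps_empty is NULL: the empty second element.
```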

Now, I'm not all doom and gloom! I don't know what character set you are using, and my client may not render the Cyrillic text, but you'll get the idea. This example uses a regex form that allows for NULL list elements, and treats one or more semicolons followed by a space, OR the end of the line, as the delimiter.

with tbl(id, str) as (
  select 1, '119021-1, г Москва, ул Льва Толстого, д 16; 117312-1, г Москва, ул Вавилова, д 19' from dual union all
  select 2, '119021-2, г Москва, ул Льва Толстого, д 16;;; 117312-2, г Москва, ул Вавилова, д 19' from dual
)
select id, regexp_substr(str, '(.*?)(;+ |$)', 1, level, NULL, 1) addr_fixed
from tbl
connect by regexp_substr(str, '(.*?)(;+ |$)', 1, level, NULL, 1) is not null
  and prior id = id
  and prior sys_guid() is not null;


        ID ADDR_FIXED
---------- ---------------------------------------------
         1 119021-1, г Москва, ул Льва Толстого, д 16
         1 117312-1, г Москва, ул Вавилова, д 19
         2 119021-2, г Москва, ул Льва Толстого, д 16
         2 117312-2, г Москва, ул Вавилова, д 19

4 rows selected.
