I have a table:
CREATE TABLE "text_file"
( "SEQ" NUMBER,
"SPLIT_VALUE" CLOB
)
The content of the table is:
SEQ SPLIT_VALUE
1 MSH|^~\&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105
2 MSH|^~\&|GHH LAB|ELAB-3|GHH OE|BLDG4|200202150930||ORU^R01
PID|||555-44-4444||EVERYWOMAN^EVE^E^^^^L|JONES|19620320|F|||153 FERNWOOD DR.^^STATESVILLE^OH^35292|
OBR|1|845439^GHH OE|1045813^GHH LAB|15545^GLUCOSE|||200202150730
OBX|1|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^182|mg/dl|70_105
OBX|2|SN|1554-5^GLUCOSE^POST 12H CFST:MCNC:PT:SER/PLAS:QN||^172|mg/dl|70_105
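Please note: segment names such as MSH, OBR, OBX and LX can be either two or three characters long, so the safest approach is to take everything before the first pipe as the segment name.
I am looking to split the string in split_value into multiple rows under the following conditions:
Each line is a segment, and each segment is split into elements on the pipe character |, numbered from 00 (the segment name itself), for example MSH03, MSH04.
If an element contains ^, it is broken down even further, for example MSH09-01, MSH09-02.
Please note: there is an exception for segment MSH. For MSH, the first element is | and the second one is ^~\&.
The first few rows of the expected output are: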
SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~\&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3
Please note: I have around 90,000 rows in the text_file table, so the solution should be able to process 90,000 rows efficiently.
The complete output is:
SEQ SPLIT_SEQ SEG_SEQ SPLIT_SEQ_VALUE
1 MSH00 1 MSH
1 MSH01 1 |
1 MSH02 1 ^~\&
1 MSH03 1 GHH LAB
1 MSH04 1 ELAB-3
1 MSH05 1 GHH OE
1 MSH06 1 BLDG4
1 MSH07 1 200202150930
1 MSH08 1
1 MSH09-01 1 ORU
1 MSH09-02 1 R01
1 PID00 1 PID
1 PID01 1
1 PID02 1
1 PID03 1 555-44-4444
1 PID04 1
1 PID05-01 1 EVERYWOMAN
1 PID05-02 1 EVE
1 PID05-03 1 E
1 PID05-04 1
1 PID05-05 1
1 PID05-06 1
1 PID05-07 1 L
1 PID06 1 JONES
1 PID07 1 19620320
1 PID08 1 F
1 PID09 1
1 PID10 1
1 PID11-01 1 153 FERNWOOD DR.
1 PID11-02 1
1 PID11-03 1 STATESVILLE
1 PID11-04 1 OH
1 PID11-05 1 35292
1 PID12 1
1 OBR00 1 OBR
1 OBR01 1 1
1 OBR02-01 1 845439
1 OBR02-02 1 GHH OE
1 OBR03-01 1 1045813
1 OBR03-02 1 GHH LAB
1 OBR04-01 1 15545
1 OBR04-02 1 GLUCOSE
1 OBR05 1
1 OBR06 1
1 OBR07 1 200202150730
1 OBX00 1 OBX
1 OBX01 1 1
1 OBX02 1 SN
1 OBX03-01 1 1554-5
1 OBX03-02 1 GLUCOSE
1 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 1
1 OBX05-01 1
1 OBX05-02 1 182
1 OBX06 1 mg/dl
1 OBX07 1 70_105
1 OBX00 2 OBX
1 OBX01 2 2
1 OBX02 2 SN
1 OBX03-01 2 1554-5
1 OBX03-02 2 GLUCOSE
1 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
1 OBX04 2
1 OBX05-01 2
1 OBX05-02 2 172
1 OBX06 2 mg/dl
1 OBX07 2 70_105
2 MSH00 1 MSH
2 MSH01 1 |
2 MSH02 1 ^~\&
2 MSH03 1 GHH LAB
2 MSH04 1 ELAB-3
2 MSH05 1 GHH OE
2 MSH06 1 BLDG4
2 MSH07 1 200202150930
2 MSH08 1
2 MSH09-01 1 ORU
2 MSH09-02 1 R01
2 PID00 1 PID
2 PID01 1
2 PID02 1
2 PID03 1 555-44-4444
2 PID04 1
2 PID05-01 1 EVERYWOMAN
2 PID05-02 1 EVE
2 PID05-03 1 E
2 PID05-04 1
2 PID05-05 1
2 PID05-06 1
2 PID05-07 1 L
2 PID06 1 JONES
2 PID07 1 19620320
2 PID08 1 F
2 PID09 1
2 PID10 1
2 PID11-01 1 153 FERNWOOD DR.
2 PID11-02 1
2 PID11-03 1 STATESVILLE
2 PID11-04 1 OH
2 PID11-05 1 35292
2 PID12 1
2 OBR00 1 OBR
2 OBR01 1 1
2 OBR02-01 1 845439
2 OBR02-02 1 GHH OE
2 OBR03-01 1 1045813
2 OBR03-02 1 GHH LAB
2 OBR04-01 1 15545
2 OBR04-02 1 GLUCOSE
2 OBR05 1
2 OBR06 1
2 OBR07 1 200202150730
2 OBX00 1 OBX
2 OBX01 1 1
2 OBX02 1 SN
2 OBX03-01 1 1554-5
2 OBX03-02 1 GLUCOSE
2 OBX03-03 1 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 1
2 OBX05-01 1
2 OBX05-02 1 182
2 OBX06 1 mg/dl
2 OBX07 1 70_105
2 OBX00 2 OBX
2 OBX01 2 2
2 OBX02 2 SN
2 OBX03-01 2 1554-5
2 OBX03-02 2 GLUCOSE
2 OBX03-03 2 POST 12H CFST:MCNC:PT:SER/PLAS:QN
2 OBX04 2
2 OBX05-01 2
2 OBX05-02 2 172
2 OBX06 2 mg/dl
2 OBX07 2 70_105
I believe a PL/SQL pipelined function would be the best way to do this.
Any help would be appreciated.
Since this is PL/SQL, and assuming your string can be of arbitrary length (i.e. more than 32K), you should use a table function together with the dbms_lob package to parse it and return multiple rows.
"Blob Journey from Web to DB" is a general article that shows how to manipulate BLOBs from a web point of view, but the approach is the same. See the section around [Selecting Data]. That example simply splits at 4000 bytes, whereas your split logic will have to take the | delimiter into account. The idea is the same, though.
Later in the article, see the [table] usage along with PL/SQL.
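To make the idea concrete, here is a minimal sketch of such a pipelined table function, not a tuned or definitive implementation. The names hl7_row_t, hl7_tab_t and split_hl7 are made up for illustration. It assumes segments are separated by a line feed (optionally preceded by a carriage return), that a single segment line fits in 32767 bytes, that a single element fits in 4000 bytes, and that every segment starts with a short segment name. It walks the CLOB line by line with dbms_lob, splits each line on |, then splits elements on ^, and treats MSH01 and MSH02 as the special case described in the question.

```sql
-- Hypothetical row and collection types for the output (names are illustrative).
CREATE OR REPLACE TYPE hl7_row_t AS OBJECT (
  seq             NUMBER,
  split_seq       VARCHAR2(30),
  seg_seq         NUMBER,
  split_seq_value VARCHAR2(4000)
);
/
CREATE OR REPLACE TYPE hl7_tab_t AS TABLE OF hl7_row_t;
/
CREATE OR REPLACE FUNCTION split_hl7 RETURN hl7_tab_t PIPELINED
IS
  TYPE counter_t IS TABLE OF PLS_INTEGER INDEX BY VARCHAR2(100);
  l_counts counter_t;          -- occurrences of each segment name in one message
  l_len    PLS_INTEGER;
  l_pos    PLS_INTEGER;
  l_eol    PLS_INTEGER;
  l_line   VARCHAR2(32767);    -- assumption: one segment line fits in 32767 bytes
  l_seg    VARCHAR2(100);
  l_field  VARCHAR2(32767);
  l_fstart PLS_INTEGER;
  l_fend   PLS_INTEGER;
  l_idx    PLS_INTEGER;        -- element number used in SPLIT_SEQ
  l_cstart PLS_INTEGER;
  l_cend   PLS_INTEGER;
  l_cidx   PLS_INTEGER;        -- component number after the dash
BEGIN
  FOR r IN (SELECT seq, split_value FROM "text_file" ORDER BY seq) LOOP
    l_counts.DELETE;
    l_len := NVL(DBMS_LOB.GETLENGTH(r.split_value), 0);
    l_pos := 1;
    WHILE l_pos <= l_len LOOP
      -- Locate the end of the current line; the last line may have no LF.
      l_eol := NVL(DBMS_LOB.INSTR(r.split_value, CHR(10), l_pos), 0);
      IF l_eol = 0 THEN
        l_eol := l_len + 1;
      END IF;
      l_line := RTRIM(DBMS_LOB.SUBSTR(r.split_value, l_eol - l_pos, l_pos), CHR(13));
      l_pos  := l_eol + 1;
      -- Segment name is everything before the first pipe.
      l_seg := SUBSTR(l_line, 1, INSTR(l_line || '|', '|') - 1);
      IF l_seg IS NOT NULL THEN
        IF l_counts.EXISTS(l_seg) THEN
          l_counts(l_seg) := l_counts(l_seg) + 1;
        ELSE
          l_counts(l_seg) := 1;
        END IF;
        l_idx    := 0;
        l_fstart := 1;
        LOOP
          l_fend := INSTR(l_line, '|', l_fstart);
          IF l_fend = 0 THEN
            l_fend := LENGTH(l_line) + 1;
          END IF;
          l_field := SUBSTR(l_line, l_fstart, l_fend - l_fstart);
          IF INSTR(l_field, '^') > 0 AND NOT (l_seg = 'MSH' AND l_idx = 2) THEN
            -- Element has components: emit one row per component.
            l_cidx   := 1;
            l_cstart := 1;
            LOOP
              l_cend := INSTR(l_field, '^', l_cstart);
              IF l_cend = 0 THEN
                l_cend := LENGTH(l_field) + 1;
              END IF;
              PIPE ROW (hl7_row_t(r.seq,
                l_seg || TO_CHAR(l_idx, 'FM9900') || '-' || TO_CHAR(l_cidx, 'FM9900'),
                l_counts(l_seg),
                SUBSTR(l_field, l_cstart, l_cend - l_cstart)));
              EXIT WHEN l_cend > LENGTH(l_field);
              l_cstart := l_cend + 1;
              l_cidx   := l_cidx + 1;
            END LOOP;
          ELSE
            PIPE ROW (hl7_row_t(r.seq, l_seg || TO_CHAR(l_idx, 'FM9900'),
                                l_counts(l_seg), l_field));
          END IF;
          EXIT WHEN l_fend > LENGTH(l_line);
          l_fstart := l_fend + 1;
          l_idx    := l_idx + 1;
          -- MSH exception: element 01 is the field separator itself.
          IF l_seg = 'MSH' AND l_idx = 1 THEN
            PIPE ROW (hl7_row_t(r.seq, 'MSH01', l_counts(l_seg), '|'));
            l_idx := 2;
          END IF;
        END LOOP;
      END IF;
    END LOOP;
  END LOOP;
  RETURN;
END split_hl7;
/
```

It could then be queried with SELECT * FROM TABLE(split_hl7()). Rows are piped in message order, but add an explicit ORDER BY if the order matters. If your messages use a bare carriage return as the segment terminator, search for CHR(13) instead of CHR(10).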