
Using REGEXP_SUBSTR to get key-value pair data

I have a column with values like the one below:

User_Id=446^User_Input=L307-60#/25" AP^^

I am trying to extract each individual value by its key:

  1. Everything after User_Id= up to the next ^
  2. Everything after User_Input= up to the terminating ^

Here is what I have tried so far:

SELECT  LTRIM(REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^'
          ,'[0-9]+',1,1),'^') User_Id 
from dual

How do I get the value for User_Input?

PS: The user input can contain anything, such as ', ", * or %, including a ^ in the middle of the string (that is, not as a delimiter).

Any help would be greatly appreciated.

If there is no particular need to use Regex, something like this returns the value.

WITH rslt AS (
SELECT 'User_Id=446^User_Input=L307-60#/25" AP^' val 
  FROM dual
)
SELECT LTRIM(SUBSTR(val
                   ,INSTR(val, '=', 1, 2) + 1
                   ,INSTR(val, '^', 1, 2) - (INSTR(val, '=', 1, 2) + 1)))
  FROM rslt;

Of course, if the value itself can contain carets as ordinary text characters, this will return a truncated result, because the extraction stops at the second ^ in the string.
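To make that caveat concrete, here is a small sketch that runs the same expression against a made-up value, 'A^B', which contains a caret that is not a delimiter. The sample string is an assumption for illustration only and does not come from the question.

```sql
-- Hypothetical sample: the value of User_Input is meant to be 'A^B'.
WITH rslt AS (
SELECT 'User_Id=446^User_Input=A^B^' val
  FROM dual
)
SELECT SUBSTR(val
             ,INSTR(val, '=', 1, 2) + 1                           -- first character after the second '='
             ,INSTR(val, '^', 1, 2) - (INSTR(val, '=', 1, 2) + 1) -- length up to the second '^'
             ) AS user_input
  FROM rslt;
-- Returns 'A' rather than 'A^B': the second '^' in the string
-- is the caret inside the value, not the closing delimiter.
```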

This can be easily solved using boring old INSTR to calculate the offsets of the start and end points for the KEY and VALUE strings.

The trick is to use the optional occurrence parameter to pick the correct instance of =. Because the input can contain carets which aren't intended as delimiters, we need to use a negative position, which makes INSTR search backwards from the end of the string, to find the last ^.

with cte as  (
  select kv
         , instr(kv, '=', 1, 1)+1 as k_st  -- first occurrence 
         , instr(kv, '^', 1) as k_end
         , instr(kv, '=', 1, 2)+1 as v_st  -- second occurrence 
         , instr(kv, '^', -1) as v_end     -- counting from back
  from t23
  )
select substr(kv, k_st, k_end - k_st) as user_id
       , substr(kv, v_st, v_end - v_st) as user_input
from cte
/

Here is the requisite SQL Fiddle to prove it works. I think it's much easier to understand than any regex equivalent.
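For readers who cannot open the fiddle, the query can be tried without creating a table by supplying the rows inline. This is only a sketch: the table name t23 and column kv are taken from the answer above, while the second sample row (with a caret inside the value) is made up for illustration.

```sql
with t23 as (
  -- first row: the string from the question, with a single trailing caret
  select 'User_Id=446^User_Input=L307-60#/25" AP^' as kv from dual
  union all
  -- second row: hypothetical value containing a non-delimiter caret
  select 'User_Id=7^User_Input=a^b^' from dual
  )
, cte as (
  select kv
         , instr(kv, '=', 1, 1)+1 as k_st  -- start of the User_Id value
         , instr(kv, '^', 1) as k_end      -- first caret ends the User_Id value
         , instr(kv, '=', 1, 2)+1 as v_st  -- start of the User_Input value
         , instr(kv, '^', -1) as v_end     -- last caret, found by searching backwards
  from t23
  )
select substr(kv, k_st, k_end - k_st) as user_id
       , substr(kv, v_st, v_end - v_st) as user_input
from cte;
-- Row 1: user_id = 446, user_input = L307-60#/25" AP
-- Row 2: user_id = 7,   user_input = a^b
```

Note that the last caret is always treated as the delimiter, so a string ending in two carets (as in the column value quoted in the question) would keep the first of them as part of user_input.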

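Assuming that you will always have 'User_Id=' and 'User_Input=' in your string, I would parse it with capture groups (subexpressions).

Use the start anchor, ^, and the end anchor, $, and match the literal text 'User_Id=' and 'User_Input='.

Wrap each value you are searching for in its own capture group, then pass the group number as the last argument of REGEXP_SUBSTR. Note that, as written below, each group also captures the trailing ^ delimiter.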

SCOTT@dev> 
  1  SELECT REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^','^User_Id=(.*\^)User_Input=(.*\^)$',1, 1, NULL, 1) User_Id
  2* FROM dual
SCOTT@dev> /

USER
====
446^


SCOTT@dev> 
  1  SELECT REGEXP_SUBSTR('User_Id=446^User_Input=L307-60#/25" AP^','^User_Id=(.*\^)User_Input=(.*\^)$',1, 1, NULL, 2) User_Input
  2* FROM dual
SCOTT@dev> /

USER_INPUT
================
L307-60#/25" AP^

SCOTT@dev> 
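If the trailing caret should not be part of the result, one possible variation (a sketch, not part of the original answer) is to move each \^ outside its capture group. It relies on the same subexpression argument of REGEXP_SUBSTR that the queries above already use.

```sql
-- Same pattern as above, but the delimiters sit outside the groups,
-- so each group holds only the value.
SELECT REGEXP_SUBSTR(val, '^User_Id=(.*)\^User_Input=(.*)\^$', 1, 1, NULL, 1) AS user_id
     , REGEXP_SUBSTR(val, '^User_Id=(.*)\^User_Input=(.*)\^$', 1, 1, NULL, 2) AS user_input
  FROM (SELECT 'User_Id=446^User_Input=L307-60#/25" AP^' AS val FROM dual);
-- user_id    = 446
-- user_input = L307-60#/25" AP
```

Because the second group is greedy and the pattern is anchored with $, a caret in the middle of the user input stays inside the value; only the final caret is consumed as the delimiter.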

I got this answer from a friend of mine. It looks simple and works great.

SELECT
regexp_replace('User_Id=446^User_Input=L307-60#/25" AP^^', '.*User_Id=([^\^]+).*', '\1') User_Id,
regexp_replace('User_Id=446^User_Input=L307-60#/25" AP^^', '.*User_Input=(.*)[\^]$', '\1') User_Input
FROM dual

Posting it here in case any of you find it interesting.
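One thing to keep in mind with the REGEXP_REPLACE approach: when the pattern does not match at all, REGEXP_REPLACE returns the source string unchanged, whereas REGEXP_SUBSTR returns NULL. The sketch below contrasts the two on a made-up string that has no User_Id key; the sample string and column aliases are assumptions for illustration.

```sql
SELECT regexp_replace('no keys here', '.*User_Id=([^\^]+).*', '\1') AS via_replace
     , regexp_substr('no keys here', 'User_Id=([^\^]+)', 1, 1, NULL, 1) AS via_substr
  FROM dual;
-- via_replace = no keys here   (input passed through untouched)
-- via_substr  = NULL           (no match)
```

If rows without the key are possible, the NULL result is usually easier to detect than an unchanged input string.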
