
Extract data from a specific format

I have data in a VARCHAR2(4000) column of an Oracle 11g table.

The data looks like this:

"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE" 

"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"  

Can someone please help me do this in Oracle SQL?

Expected output:

LOCT = NY 

DISPLAY_TYP = M,F,B   

ID = 100  

PRICE = >=20  

The data needs to be extracted based on LOCT, PRICE, DISPLAY_TYP, ID and FILTER to retrieve the corresponding values.

Thanks

Here's how you can separate out all the values into individual columns:

with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
       regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
       regexp_substr(str, '"ID":"([^",]*)"', 1, 1, null, 1) id,
       regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
from   sample_data;

      PKEY LOCT PRICE DISPLAY_TYP     ID    FILTER
---------- ---- ----- --------------- ----- ------
         1 MA   10    M,F,B           101   LTE   
         2 NY   30    M,F,B           100   GTE   
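
For readers unfamiliar with the six-argument form of regexp_substr used above, here is a minimal sketch that annotates each argument. The literal string is a shortened copy of the sample data; nothing else is assumed.

```sql
-- Annotated single call: pull the ID value out of one string.
select regexp_substr('"LOCT":"NY","ID":"100"',  -- source string
                     '"ID":"([^",]*)"',         -- pattern with one capture group
                     1,                         -- start searching at character 1
                     1,                         -- take the first occurrence
                     null,                      -- no match modifiers
                     1) id                      -- return capture group 1, not the whole match
from   dual;
```

Without the final argument, the call would return the whole match, including the "ID": prefix and the surrounding quotes, which is why every extraction in the query asks for subexpression 1.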

And here's how you could unpivot that out into separate rows:

with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       str_part||' = '||val sub_str
from   (select pkey,
               str,
               regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
               regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
               replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
               regexp_substr(str, '"ID":"([^",]*)"', 1, 1, null, 1) id,
               regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
        from   sample_data) res
unpivot (val for str_part in (loct, price, display_typ, id, filter));

      PKEY SUB_STR                            
---------- -----------------------------------
         1 LOCT = MA                          
         1 PRICE = 10                         
         1 DISPLAY_TYP = M,F,B                
         1 ID = 101                           
         1 FILTER = LTE                       
         2 LOCT = NY                          
         2 PRICE = 30                         
         2 DISPLAY_TYP = M,F,B                
         2 ID = 100                           
         2 FILTER = GTE      

NB: both solutions rely on the fact that " and , won't appear in the value of any of the sub-parts, except display_typ (where it's expected that [ and ] won't appear as part of the value).
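
To illustrate the display_typ caveat: the .* in that pattern is greedy, so if some later part of the string also contained a ], the match would run on to the last one. One possible tightening is to capture only characters that are not ]. This is a sketch, and the "TAGS" key below is made up purely to trigger the problem; it is not part of the original data.

```sql
-- Sketch only: "TAGS" is a hypothetical key added to show the greedy-match problem.
with sample_data as (select 1 pkey, '"LOCT":"MA","DISPLAY_TYP": ["M","F","B"],"TAGS":["X","Y"],"ID":"101"' str from dual)
select pkey,
       -- greedy .* keeps going until the LAST ] in the string
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') greedy_typ,
       -- [^]]* stops at the FIRST ] after the opening bracket
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[([^]]*)\]', 1, 1, null, 1), '"') display_typ
from   sample_data;
```

With the original sample rows both patterns should give the same result, since DISPLAY_TYP is the only bracketed value there.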


For the case where the ID may or may not be enclosed in quotes (I wasn't sure if it sometimes would be), this should work:

with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":101,"FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual union all
                     select 3 pkey, '"LOCT":"OH","DISPLAY_TYP":["F","B"],"PRICE":"50","FILTER":"BOO","ID":"102"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
       regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
       regexp_substr(str, '"ID":"?([^",]*)"?', 1, 1, null, 1) id,
       regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
from   sample_data;

      PKEY LOCT PRICE DISPLAY_TYP     ID    FILTER
---------- ---- ----- --------------- ----- ------
         1 MA   10    M,F,B           101   LTE   
         2 NY   30    M,F,B           100   GTE   
         3 OH   50    F,B             102   BOO   

The "? in the regular expression means a double quote must appear either 0 or 1 times at that point in the pattern.

If double quotes will never be present around the ID value, then you could just use '"ID":([^,]*)' as the pattern.
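
Going back to the question's expected output, which shows the price with a comparison operator in front of it: if the intent is to combine FILTER and PRICE, something along these lines might do. It is only a sketch, and it assumes that GTE means >= and LTE means <=, which the question does not actually state.

```sql
-- Assumption (not confirmed by the question): GTE maps to >= and LTE maps to <=.
with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
select pkey,
       'PRICE = ' ||
       case regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1)
         when 'GTE' then '>='
         when 'LTE' then '<='
       end ||                                   -- unmapped codes add no operator
       regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price_rule
from   sample_data;
```

Any other filter codes would need their own WHEN branches; as written, an unrecognised code simply leaves the operator out.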
