Extract data from a specific format
I have data in a VARCHAR2(4000) column in an Oracle 11g table.
The data looks like this:
"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"
"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"
Can someone please show me how I can do this in Oracle SQL?
Expected output:
LOCT = NY
DISPLAY_TYP = M,F,B
ID = 100
PRICE = >=20
The data needs to be extracted by LOCT, PRICE, DISPLAY_TYP, ID and FILTER to retrieve the corresponding values.
Thanks
Here's how you can separate out all the values into individual columns:
with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
       regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
       regexp_substr(str, '"ID":"([^",]*)"', 1, 1, null, 1) id,
       regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
from   sample_data;
      PKEY LOCT PRICE DISPLAY_TYP     ID    FILTER
---------- ---- ----- --------------- ----- ------
         1 MA   10    M,F,B           101   LTE
         2 NY   30    M,F,B           100   GTE
And here's how you could unpivot that out into separate rows:
with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       str_part||' = '||val sub_str
from   (select pkey,
               str,
               regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
               regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
               replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
               regexp_substr(str, '"ID":"([^",]*)"', 1, 1, null, 1) id,
               regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
        from   sample_data) res
unpivot (val for str_part in (loct, price, display_typ, id, filter));
      PKEY SUB_STR
---------- -----------------------------------
         1 LOCT = MA
         1 PRICE = 10
         1 DISPLAY_TYP = M,F,B
         1 ID = 101
         1 FILTER = LTE
         2 LOCT = NY
         2 PRICE = 30
         2 DISPLAY_TYP = M,F,B
         2 ID = 100
         2 FILTER = GTE
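If the FILTER code is meant to be folded into the PRICE line, as the expected output in the question seems to suggest, something like the following might work. This is only a sketch: it assumes GTE means >= and LTE means <= (the question doesn't spell out the mapping), and any other code leaves the price without an operator prefix.

```sql
with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":"101","FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual)
select pkey,
       str_part||' = '||val sub_str
from   (select pkey,
               regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
               replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
               regexp_substr(str, '"ID":"([^",]*)"', 1, 1, null, 1) id,
               -- assumed mapping: GTE -> '>=', LTE -> '<='; unmatched codes give null,
               -- and null || price is just the price in Oracle
               decode(regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1),
                      'GTE', '>=',
                      'LTE', '<=')
               || regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price
        from   sample_data) res
unpivot (val for str_part in (loct, display_typ, id, price));
```

With the sample data, the PRICE line for pkey 2 would read PRICE = >=30 and there is no separate FILTER row.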
NB: both solutions rely on the fact that " and , won't appear in the value of any of the sub-parts, except display_typ (where it's expected that [ and ] won't appear as part of the value).
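To make that caveat concrete, here is a small sketch using a made-up row (the "New York, NY" value is hypothetical and not from the question). The original pattern finds nothing when the value contains a comma, while a relaxed character class that only excludes the double quote still captures it. Whether relaxing the pattern is safe depends on your real data.

```sql
with sample_data as (select 1 pkey, '"LOCT":"New York, NY","ID":"103"' str from dual)
select pkey,
       -- [^",]* cannot cross the comma, so there is no match and the result is null
       regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct_strict,
       -- [^"]* only stops at the closing double quote, so this returns: New York, NY
       regexp_substr(str, '"LOCT":"([^"]*)"', 1, 1, null, 1) loct_relaxed
from   sample_data;
```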
For the case where the ID may or may not be enclosed in quotes (I wasn't sure whether it sometimes would be), this should work:
with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":101,"FILTER":"LTE"' str from dual union all
select 2 pkey, '"LOCT":"NY","DISPLAY_TYP":["M","F","B"],"ID":"100","PRICE":"30","FILTER":"GTE"' str from dual union all
select 3 pkey, '"LOCT":"OH","DISPLAY_TYP":["F","B"],"PRICE":"50","FILTER":"BOO","ID":"102"' str from dual)
-- end of mimicking a table called sample_data containing your strings.
-- You would just run the SQL below, replacing "sample_data" with your tablename
select pkey,
       regexp_substr(str, '"LOCT":"([^",]*)"', 1, 1, null, 1) loct,
       regexp_substr(str, '"PRICE":"([^",]*)"', 1, 1, null, 1) price,
       replace(regexp_substr(str, '"DISPLAY_TYP":\s*\[(.*)\]', 1, 1, null, 1), '"') display_typ,
       regexp_substr(str, '"ID":"?([^",]*)"?', 1, 1, null, 1) id,
       regexp_substr(str, '"FILTER":"([^",]*)"', 1, 1, null, 1) filter
from   sample_data;
      PKEY LOCT PRICE DISPLAY_TYP     ID    FILTER
---------- ---- ----- --------------- ----- ------
         1 MA   10    M,F,B           101   LTE
         2 NY   30    M,F,B           100   GTE
         3 OH   50    F,B             102   BOO
The "? in the regular expression makes the double quote optional: it may appear either 0 or 1 times at that point in the pattern.
If double quotes will never be present around the ID value, then you could just use '"ID":([^,]*)' as the pattern.
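As a quick illustration of that simpler pattern, here is a sketch against two made-up rows in which the ID is never quoted (the second row is hypothetical and puts the ID at the end of the string):

```sql
with sample_data as (select 1 pkey, '"LOCT":"MA","PRICE":"10","DISPLAY_TYP": ["M","F","B"],"ID":101,"FILTER":"LTE"' str from dual union all
                     select 2 pkey, '"LOCT":"OH","FILTER":"BOO","ID":102' str from dual)
select pkey,
       -- captures everything after "ID": up to the next comma or the end of the string
       regexp_substr(str, '"ID":([^,]*)', 1, 1, null, 1) id
from   sample_data;
```

This should return 101 and 102. Note that if a quoted ID did turn up, the quotes would be kept as part of the captured value, so the "? version is the safer choice when you're unsure.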