How to convert a Hive query with regex to Oracle
I have this text:
Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples
I want only the part after 'Process explanation:', excluding 'Final activity...' and everything after it.
Like this:
The bottle is then melted to form liquid glass.
This is the current Hive query that I want to convert to Oracle:
SELECT REGEXP_EXTRACT(
'Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples',
'.*(process[ \t]*(explanation)?[ \t]*:[ \t]*)(.*?)([ \t]*;[ \t]*final[ \t]+activity[ \t]+for[ \t]+manager.*$|$)',
3) as extracted
FROM my_table
If the substrings are exactly as you described, there is a fairly simple option: the substr + instr functions.
with test (col) as
  (select 'Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples' from dual)
select substr(col,
              -- start one character past the ':' that follows 'Process explanation'
              instr(col, 'Process explanation') + length('Process explanation') + 1,
              -- number of characters between that ':' and the ';' before 'Final activity'
              instr(col, 'Final activity') - instr(col, 'Process explanation') -
                length('Process explanation') - 2
             ) result
from test;

RESULT
----------------------------------------------
The bottle is then melted to form liquid glass
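The query above relies on both markers being present. When 'Final activity' is missing, instr returns 0, the computed length goes negative, and substr returns NULL. A minimal sketch of one way to tolerate that, assuming the start marker is literally 'Process explanation:' and the end marker, when present, is literally ';Final activity' (the second test row is made up for illustration):

```sql
with test (col) as
  (select 'Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples' from dual
   union all
   -- hypothetical row without the trailing 'Final activity' part
   select 'Process explanation:The bottle is then melted to form liquid glass' from dual)
select substr(col,
              -- first character after the start marker (the marker includes the ':')
              instr(col, 'Process explanation:') + length('Process explanation:'),
              -- end position: the ';Final activity' marker, or one past the end
              -- of the string when instr returns 0 (marker not found)
              nvl(nullif(instr(col, ';Final activity'), 0), length(col) + 1)
                - instr(col, 'Process explanation:')
                - length('Process explanation:')
             ) result
from test;
```

Both rows should return 'The bottle is then melted to form liquid glass'. Rows that lack the start marker are not handled here; they would need a WHERE filter or a CASE around the expression.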
I've come up with something like this:
with strings as
(SELECT '1Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples' str FROM DUAL
union all
SELECT '2Process explanation:The bottle is then melted to form liquid glass;' str FROM DUAL
union all
SELECT '3Process :The bottle is then melted to form liquid glass' str FROM DUAL
union all
SELECT '4Process explanation: plasma gasification combined with centrifugal activity' str FROM DUAL
union all
SELECT '5Final activity for manager:Labeling of previous samples' str FROM DUAL
)
SELECT str
, REGEXP_SUBSTR(
str,
'(.*process[[:blank:]]*(explanation)?[[:blank:]]*:[[:blank:]]*)([A-Za-z0-9 ]*)([[:blank:]]*;[[:blank:]]*final[[:blank:]]*activity[[:blank:]]*for[[:blank:]]*manager.*$)?',
1, 1, 'i',3)
as extracted
FROM strings
Resulting in:
STR | EXTRACTED
---|---
1Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples | The bottle is then melted to form liquid glass
2Process explanation:The bottle is then melted to form liquid glass; | The bottle is then melted to form liquid glass
3Process:The bottle is then melted to form liquid glass | The bottle is then melted to form liquid glass
4Process explanation: plasma gasification combined with centrifugal activity | plasma gasification combined with centrifugal activity
5Final activity for manager:Labeling of previous samples | -
This assumes that matching the [[:blank:]] class instead of your explicit space-and-tab list [ \t] is acceptable.
Edit: I modified the regexp slightly, because with the last group allowed to be empty, '.*' kept capturing the entire line.
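One limitation of the character class [A-Za-z0-9 ] in the regexp above is that it stops at the first punctuation character, so an explanation containing a comma or hyphen would be cut short. A possible variant, assuming a semicolon never appears inside the explanation itself, captures everything up to the first ';' with [^;]* instead. This is only a sketch, not the expression from the answer, and the sample string is invented for illustration:

```sql
-- Arguments after the pattern: start position 1, occurrence 1,
-- 'i' = case-insensitive matching, 2 = return the second capture group.
select regexp_substr(
         'Process explanation: melt the bottle, then cool it;Final activity for manager:Labeling',
         'process[[:blank:]]*(explanation)?[[:blank:]]*:[[:blank:]]*([^;]*)',
         1, 1, 'i', 2) as extracted
from dual;
```

This should return 'melt the bottle, then cool it'. Blanks immediately before the ';' would be kept by [^;]*, so wrapping the call in rtrim may be needed if such input can occur.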