
How to define a regexp for text in Postgres

Please help me define a Postgres regexp for this case:

I have a string field:

union all select 'AbC-345776-2345' /*comment*/ union all select 'Fgr-sdf344-111a' /*BN34*/ some text union all select 'sss-sdf34-123' /*some text*/ some text

Here is the same text in a select statement for convenience:

select 'union all select ''AbC-345776-2345'' /*comment*/ union all select ''Fgr-sdf344-111a'' /*BN34*/ some text union all select ''sss-sdf34-123'' /*some text*/ some text' as str

From this messy text I need to extract only the values inside '...' and return them as separate rows, like this:

AbC-345776-2345
Fgr-sdf344-111a
sss-sdf34-123

Pattern: 'first 2-3 letters - several letters and digits - several letters and digits'

I created this select, but its result also contains all the comments and the "some text" fragments:

select regexp_split_to_table(trim(replace(replace(replace(replace(t1.str,'union all select',''),'from DUAL',''),chr(10),''),'''','') ), E'\\s+')
from (select 'union all select ''AbC-345776-2345'' /*comment*/ union all select ''Fgr-sdf344-111a'' /*BN34*/ some text union all select ''sss-sdf34-123'' /*some text*/ some text' as str) t1; 

The following should do it:

select (regexp_matches(str, $$'([a-zA-Z]{2,3}-[a-zA-Z0-9]+-[a-zA-Z0-9]+)'$$, 'g'))[1]
from the_table;

Given your sample data it returns:

regexp_matches 
---------------
AbC-345776-2345
Fgr-sdf344-111a
sss-sdf34-123  

The regex looks for the pattern you specified, enclosed in single quotes. By wrapping the pattern in a capturing group (...), I excluded the single quotes from the result.

regexp_matches() returns one row for each match, and each row contains an array with one element per capturing group. As the regex contains only a single group, the first element of the array is the one we are interested in.
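To see the array and its first element side by side, here is a minimal self-contained sketch. It assumes PostgreSQL 9.3 or later (for LATERAL); the derived table t, the column alias str, and the aliases m, match_array and first_group are illustrative names that stand in for the real table and are not part of the original query.

```sql
-- Sketch only: the sample string is supplied inline through a derived table "t"
-- (a hypothetical stand-in for the real table that holds the column "str").
select m    as match_array,  -- text[] with one element per capturing group
       m[1] as first_group   -- the text captured by the single group
from (
  select 'union all select ''AbC-345776-2345'' /*comment*/ union all select ''Fgr-sdf344-111a'' /*BN34*/ some text union all select ''sss-sdf34-123'' /*some text*/ some text' as str
) t
-- the 'g' flag makes regexp_matches() return one row per match instead of only the first
cross join lateral regexp_matches(t.str, $$'([a-zA-Z]{2,3}-[a-zA-Z0-9]+-[a-zA-Z0-9]+)'$$, 'g') as m;
```

Calling the function in the FROM clause instead of the select list is a stylistic choice; it makes it easy to refer to the same match array more than once.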

I used dollar quoting to avoid having to escape the single quotes in the regex.
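For comparison, the same pattern written as an ordinary string literal needs each embedded single quote doubled. The sketch below just checks that both spellings produce the same string; the column alias same_pattern is an illustrative name.

```sql
-- Dollar-quoted literal on the left, standard literal with doubled quotes on the right.
-- Both denote the same string, so the comparison should yield true.
select $$'([a-zA-Z]{2,3}-[a-zA-Z0-9]+-[a-zA-Z0-9]+)'$$
     = '''([a-zA-Z]{2,3}-[a-zA-Z0-9]+-[a-zA-Z0-9]+)''' as same_pattern;
```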

Online example
