Redshift REGEXP_SUBSTR: get the last occurrence of a match

Question

I have a column holding a list of page events of all types, ordered by time ascending and built with listagg: listagg(page,';') within group (order by time)

I want to get the last occurrence of a match for the regular expression regexp_substr(event_list,'/step[0-9]+[^;]*').

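According to the documentation: "A positive integer that indicates the position within source_string to begin searching. The position is based on the number of characters, not bytes, so multibyte characters are counted as single characters. The default is 1. If position is less than 1, the search begins at the first character of source_string. If position is greater than the number of characters in source_string, the result is source_string."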

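Based on this, I would need to know the exact number of occurrences, which I do not know. How can I get the last match in that case? For example: /step1;somethging;somethig;/step2;something;/step3;something;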

I want to match step3.

PS: Sorting by time desc and taking the first match is not an option here.

Answer 1

Use regexp_count to determine how many matches there are (n), then use regexp_substr to get the n-th match.

select 
  '/step1;somethging;somethig;/step2;something;/step3;something;' string
, '/step[0-9]+[^;]*' pat
, regexp_count(string, pat) n
, regexp_substr(string, pat, 1, n) last_part

Output:

                                                       string                pat    n    last_part
/step1;somethging;somethig;/step2;something;/step3;something;   /step[0-9]+[^;]*    3       /step3
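
The count-then-index idea can be sanity-checked outside Redshift. Below is a minimal Python sketch, assuming Python's re module handles this simple pattern the same way Redshift's regex engine does. last_match is a hypothetical helper name used only for illustration, not a Redshift function.

```python
import re

# Same pattern as in the query above.
PATTERN = r"/step[0-9]+[^;]*"

def last_match(event_list, pattern=PATTERN):
    """Hypothetical helper: return the last non-overlapping match, or None."""
    matches = re.findall(pattern, event_list)
    n = len(matches)                          # plays the role of regexp_count(string, pat)
    return matches[n - 1] if n > 0 else None  # plays the role of regexp_substr(string, pat, 1, n)

events = "/step1;somethging;somethig;/step2;something;/step3;something;"
print(last_match(events))  # /step3
```

The explicit n > 0 check is there because a list with no step events at all gives a count of zero, and there is then no n-th match to fetch.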

If / can be treated as a delimiter, you can also use the following strategy:

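Reverse the string, split it on / and take the first part. Reverse that part again, prefix it with /, and apply the regex to extract the step: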

Example:

select 
  '/step1;somethging;somethig;/step2;something;/step3;something;' string
, '/' || reverse(split_part(reverse(string), '/', 1)) last_part
, regexp_substr(last_part, '/step[0-9]+[^;]*') extract_step

Output:

                                                       string           last_part   extract_step 
/step1;somethging;somethig;/step2;something;/step3;something;   /step3;something;         /step3
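
The reverse/split steps can be mirrored in a few lines of Python to see what each stage produces. This is only a sketch under the same assumption as the answer, namely that / appears nowhere except at the start of a page path. last_segment is a hypothetical helper name.

```python
import re

def last_segment(event_list, delimiter="/"):
    """Hypothetical helper mirroring '/' || reverse(split_part(reverse(s), '/', 1))."""
    reversed_first = event_list[::-1].split(delimiter)[0]  # first part of the reversed string
    return delimiter + reversed_first[::-1]                # reverse back and re-add the delimiter

events = "/step1;somethging;somethig;/step2;something;/step3;something;"
last_part = last_segment(events)
step = re.search(r"/step[0-9]+[^;]*", last_part)

print(last_part)                       # /step3;something;
print(step.group(0) if step else None) # /step3
```

Note the limitation this makes visible: if a non-step page containing / comes after the last step (for example "/step1;x;/home;"), the last segment is "/home;" and the regex finds nothing, so this variant only works when the final /-delimited segment is a step.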

Answer 1: accepted 2020-05-30 21:04:34
