Get the NEXT value in an array in BigQuery

Question

I want to get the next value in an array delimited by "->".

For example, in the rows below, I want to get the next value after "Res" in each:

ROW 1- "Q -> Res -> tes -> Res -> twet"
ROW 2- "rw -> gewg -> tes -> Res -> twet"
ROW 3- "Y -> Res -> Res -> Res -> twet"

Output would be:

ROW 1- tes
ROW 2- twet
ROW 3- twet

I've tried the following, but it gets me the previous value:

Array_reverse(split(regexp_extract(COLUMN_NAME, '(.*?)Res'), '->'))[safe_offset(1)]

Answer 1

Consider the approach below:

select id,
  -- first word after the first 'Res' that is not itself 'Res', trimmed of the spaces left by split
  ( select trim(word) from unnest(arr) word with offset
    where offset > (select offset from unnest(arr) word with offset where trim(word) = 'Res' order by offset limit 1)
    and trim(word) != 'Res' order by offset limit 1
  ) as next_word
from your_table, unnest([struct(split(path, '->') as arr)])

Applied to the sample data in your question, this returns the expected next word for each row.
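
To try the first approach without creating a table, the sample rows can be inlined as a CTE. This is a minimal sketch: the CTE name your_table and the columns id and path simply mirror the query above and are assumptions about the real schema, and the offset aliases pos and p are introduced here only for readability.

```sql
-- Sample rows inlined as a CTE; your_table, id and path are assumed names
-- that mirror the answer's query, not a known schema.
with your_table as (
  select 'ROW 1' as id, 'Q -> Res -> tes -> Res -> twet' as path union all
  select 'ROW 2', 'rw -> gewg -> tes -> Res -> twet' union all
  select 'ROW 3', 'Y -> Res -> Res -> Res -> twet'
)
select
  id,
  (
    -- first non-'Res' word located after the first 'Res'
    select trim(word)
    from unnest(arr) as word with offset as pos
    where pos > (
        -- position of the first 'Res' in the split array
        select min(p)
        from unnest(arr) as w with offset as p
        where trim(w) = 'Res'
      )
      and trim(word) != 'Res'
    order by pos
    limit 1
  ) as next_word
from your_table, unnest([struct(split(path, '->') as arr)])
order by id
```

For the three sample rows this should return tes, twet and twet. A row that contains no 'Res' at all yields NULL, because the inner subquery returns NULL and the comparison filters out every element.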

Another option is:

select id, 
  ( select split(pair, ' -> ')[offset(1)]
    from unnest(arr) pair with offset
    where trim(pair) != 'Res -> Res'
    order by offset limit 1
  ) as next_word
from your_table, unnest([struct(regexp_extract_all(path, r' Res -> \w+') as arr)])

with the same output.

The benefit of the latter solution is that it can easily be adjusted to catch all instances (in one row) of words placed after 'Res', as in the example below:

select id, 
  array( select split(pair, ' -> ')[offset(1)]
    from unnest(arr) pair with offset
    where trim(pair) != 'Res -> Res'
    order by offset 
  ) as next_words
from your_table, unnest([struct(regexp_extract_all(path, r' Res -> \w+') as arr)])

The output is one array of words per row; for the sample data, ROW 1 yields both tes and twet.
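
One thing to watch with the regex version: regexp_extract_all returns non-overlapping matches, so a run with an even number of consecutive 'Res' entries (for example "Y -> Res -> Res -> twet") is consumed as a single ' Res -> Res' pair and the word after it is missed. A possible alternative, sketched below, works on array positions instead. It assumes the delimiter is always exactly ' -> ' (so no trimming is needed), and the names sample_data, arr and pos are illustrative only.

```sql
-- Sketch: collect every word that directly follows a 'Res' and is not 'Res' itself.
-- Assumes elements are separated by exactly ' -> '; sample_data is an inlined stand-in table.
with sample_data as (
  select 'ROW 1' as id, 'Q -> Res -> tes -> Res -> twet' as path union all
  select 'ROW 2', 'rw -> gewg -> tes -> Res -> twet' union all
  select 'ROW 3', 'Y -> Res -> Res -> Res -> twet'
)
select
  id,
  array(
    -- safe_offset returns NULL past the end of the array,
    -- so a trailing 'Res' is filtered out instead of raising an error
    select arr[safe_offset(pos + 1)]
    from unnest(arr) as word with offset as pos
    where word = 'Res'
      and arr[safe_offset(pos + 1)] != 'Res'
    order by pos
  ) as next_words
from sample_data, unnest([struct(split(path, ' -> ') as arr)])
order by id
```

On the sample rows this should give [tes, twet] for ROW 1 and [twet] for ROW 2 and ROW 3. Taking the first element of next_words reproduces the single-value result.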

Answer 2

Consider the approach below, which uses distinct to remove duplicates while retaining the first occurrence:

with sample_data as (
  select 'ROW 1' as id, "Q -> Res -> tes -> Res -> twet" as test_str
  union all select 'ROW 2' as id, "rw -> gewg -> tes -> Res -> twet" as test_str
  union all select 'ROW 3' as id, "Y -> Res -> Res -> Res -> twet" as test_str
),
remove_dups as (
select 
  id,
  array_to_string(array(select distinct * from unnest(split(test_str,' -> '))),',') as clean_str
from sample_data
)
select 
  id,
  regexp_extract(clean_str,r',?Res\,(\w+)\,?') as next_word
from remove_dups

The output matches the expected result in the question.
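
As far as I know, an array built from select distinct without an order by is not documented to keep the original element order, so the result above relies on behavior that happens to hold in practice. A hedged variant that makes the ordering explicit is sketched below: it keeps each word at the position of its first occurrence. The aliases word and pos are introduced here for illustration. Note that, like the original, this also drops repeated words other than 'Res'.

```sql
-- Sketch: order-preserving de-duplication before extracting the word after 'Res'.
with sample_data as (
  select 'ROW 1' as id, 'Q -> Res -> tes -> Res -> twet' as test_str union all
  select 'ROW 2', 'rw -> gewg -> tes -> Res -> twet' union all
  select 'ROW 3', 'Y -> Res -> Res -> Res -> twet'
),
remove_dups as (
  select
    id,
    array_to_string(
      array(
        -- one row per distinct word, ordered by where it first appeared
        select word
        from unnest(split(test_str, ' -> ')) as word with offset as pos
        group by word
        order by min(pos)
      ),
      ','
    ) as clean_str
  from sample_data
)
select
  id,
  -- anchor 'Res' at the start of the string or after a comma so it only matches a whole element
  regexp_extract(clean_str, r'(?:^|,)Res,(\w+)') as next_word
from remove_dups
order by id
```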
