
Get Position of a String in a field with delimiters using ANOTHER field BigQuery

I want to get the position of a word in a field that contains the following data, with "->" as the delimiter:

Example Col1:

Row 1| "ACT -> BAT -> CAT -> DATE -> EAT"

Row 2| "CAT -> ACT -> EAT -> BAT -> DATE"

I would like to extract the position of a value that is stored in ANOTHER COLUMN.

Example Col2:

Row 1| CAT

Row 2| ACT

The output would be:

Row 1| 3

Row 2| 2

I've tried regexp_instr and instr, but I think they both return the position of the character, not of the word.

I also tried this, but it doesn't work:

select *, array_length(split(regexp_extract( col1 , col2 ), '->'))

How about this:

select
  col1_item,
  col2,
  -- col2_index is the zero-based offset of col1_item within the split array
  (case when trim(col1_item) = trim(col2) then col2_index else null end) as col2_index_found
from (
  select col1_item, col2, col2_index
  from (
    select split("ACT->BAT->CAT->DATE->EAT", "->") as col1, 'CAT' as col2
    union all
    select split("CAT->ACT->EAT->BAT->DATE", "->") as col1, 'ACT' as col2
  ), unnest(col1) as col1_item with offset as col2_index
)

This will give you what you want. Just one note: the offset is a zero-based array index, so add 1 to get the one-based position shown in the question.
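As a rough sketch of how the same idea could return one row per input with a one-based position, the query below uses a correlated subquery over the split array. The names sample_data, item, idx and word_position are made up for illustration, and the sample strings use the spaced " -> " delimiter from the question, which is why both sides are trimmed.

```sql
-- Sketch only: sample_data, item, idx and word_position are hypothetical names.
with sample_data as (
  select "ACT -> BAT -> CAT -> DATE -> EAT" as col1, "CAT" as col2
  union all
  select "CAT -> ACT -> EAT -> BAT -> DATE", "ACT"
)
select
  col1,
  col2,
  (
    -- offsets are zero-based, so add 1 to get the one-based position
    select min(idx) + 1
    from unnest(split(col1, '->')) as item with offset as idx
    -- trim removes the spaces left around each item after splitting on '->'
    where trim(item) = trim(col2)
  ) as word_position
from sample_data
```

With the two sample rows, word_position should come out as 3 and 2. If col2 does not occur in col1, the subquery finds no rows and word_position is null.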

Consider the approach below, which uses arrays:

with sample_data as (
  select "ACT->BAT->CAT->DATE->EAT" as col1, "CAT" as col2
  union all select "CAT->ACT->EAT->BAT->DATE" as col1, "ACT" as col2
),
split_col1 as (
  select
    split(col1, "->") as col1_arr,
    col2,
  from sample_data
)
select
  -- index is a zero-based offset, so add 1 for the one-based position
  if(col2 = col1_arr[offset(index)], index+1, null) as col2_index
from split_col1,
  unnest(generate_array(0, array_length(col1_arr)-1)) as index
where col2 = col1_arr[offset(index)]

Output:

[image: query output]

Consider the approach below:

select *, 
  array_length(split(regexp_extract(col1, r'(.*?)' || col2), '->')) as position
from your_table             

If applied to the sample data in your question, the output is:

[image: query output]
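To make the regular-expression approach easier to follow, here is a sketch that exposes the intermediate value. The capture group (.*?) lazily grabs everything before the first occurrence of col2, and splitting that prefix on '->' yields one element per preceding word plus one trailing element, so the array length is the one-based position. The names sample_data, prefix and word_position are hypothetical and used only for illustration.

```sql
-- Sketch only: sample_data, prefix and word_position are hypothetical names.
with sample_data as (
  select "ACT -> BAT -> CAT -> DATE -> EAT" as col1, "CAT" as col2
  union all
  select "CAT -> ACT -> EAT -> BAT -> DATE", "ACT"
)
select
  col1,
  col2,
  -- everything that precedes the first match of col2, e.g. "ACT -> BAT -> "
  regexp_extract(col1, r'(.*?)' || col2) as prefix,
  -- number of pieces in the prefix when split on the delimiter
  array_length(split(regexp_extract(col1, r'(.*?)' || col2), '->')) as word_position
from sample_data
```

Two caveats apply to this sketch. col2 is interpreted as a regular expression, so values containing characters such as "." or "+" may not behave as literal text. The match is also on substrings, so a col2 of "ACT" would match inside an earlier word such as "FACT". If either is a concern, comparing whole array elements, as in the array-based answers, avoids both.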


 