
How to extract words from a string that end with substrings listed in an array? BigQuery

I have a table whose rows have cells containing multiple strings, like this:

 K1111=V1111;K1=V1;kv13_key4=--xxxxxsomething;id5=true;impid=23123123;location=domain_co_uk

I need to extract the substring that begins with kv13_key4= and includes whatever value follows it. The lengths all vary, and the substrings are all separated by semicolons. I tried

REGEXP_EXTRACT(customtargeting,'%in2w_key4%;') As contains_key_Value

but it didn't work. I need something like this:

| Original Cell                                                                                            | Extracted                      |
| key88=1811111;id89=9990string;K1=V1;23234234234tttttttt13_key4=--x;id5=true;impid=23123;url=domain_co_uk | kv13_key4=--x                  |
| K1111=V1111;K1=V1;kv13_key4=--xsomething;id5=true;impid=23123123;location=domain_co_uk                   | kv13_key4=--xsomething         |
| ;id5=true;T6791=V1111;K1=V1;kv13_key4=--xxxxxsomething123;impid=23123                                    | kv13_key4=--xxxxxsomething123  |

Does this regex work?

(?<=kv13_key4=)[^;]+(?=;)

It captures everything between 'kv13_key4=' and the nearest ';'.

Your REGEXP_EXTRACT would look like:

REGEXP_EXTRACT(customtargeting,r'(?<=kv13_key4=)[^;]+(?=;)')
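
One caveat: BigQuery's regular-expression functions are built on RE2, which does not support lookbehind or lookahead, so the pattern above is likely to be rejected as written. A minimal sketch of the same idea with a capturing group instead (when the pattern contains one capturing group, REGEXP_EXTRACT returns the text matched by that group). The `sample` CTE and its single row are only an illustration, not the real table:

```sql
-- Hypothetical one-row stand-in for the real table.
WITH sample AS (
  SELECT 'K1111=V1111;K1=V1;kv13_key4=--xsomething;id5=true' AS customtargeting
)
SELECT
  -- The parentheses capture only the value after "kv13_key4=",
  -- up to (but not including) the next semicolon or the end of the string.
  REGEXP_EXTRACT(customtargeting, r'kv13_key4=([^;]+)') AS contains_key_value
FROM sample;
```

For the row above this returns only the value part, `--xsomething`. Dropping the trailing `;` requirement also means the key is still found when it is the last pair in the string.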

Consider the query below:

select *, regexp_extract(customtargeting, r'kv13_key4=[^;]+') as Extracted
from your_table            

If applied to the sample data in your question, the output is:

[image: query output]
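
A self-contained way to check that query, with two of the sample rows from the question inlined as a CTE. The CTE is only a stand-in for the real table, which is assumed to have a `customtargeting` string column:

```sql
-- Stand-in for the real table: two sample rows copied from the question.
WITH your_table AS (
  SELECT 'K1111=V1111;K1=V1;kv13_key4=--xsomething;id5=true;impid=23123123;location=domain_co_uk' AS customtargeting
  UNION ALL
  SELECT ';id5=true;T6791=V1111;K1=V1;kv13_key4=--xxxxxsomething123;impid=23123'
)
SELECT
  customtargeting,
  -- No capturing group, so the whole match is returned, key included.
  REGEXP_EXTRACT(customtargeting, r'kv13_key4=[^;]+') AS Extracted
FROM your_table;
```

The Extracted column should come back as `kv13_key4=--xsomething` and `kv13_key4=--xxxxxsomething123`, matching the desired output in the question. Rows that do not contain the key yield NULL.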
