
Regexp_Extract BigQuery anything up to "|"

I'm fairly new to coding and I was wondering if you could give me a hand writing a regular expression for BigQuery SQL.

Basically, I would like to extract everything before the bar sign "|" for one of my columns.

Example:

  • Source string: bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed

  • Desired output: bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah

I thought about using the REGEXP_EXTRACT(string, delimiter) function, but I'm totally unable to write regex (LOL). So I had a look over Stack and found stuff like:

SELECT REGEXP_EXTRACT( String_Name , "\S*\s*\|" ) ,
# or 
SELECT REGEXP_EXTRACT( String_Name , '.+?(?=|)')

But every time I get error messages like "invalid perl operator: (?=" or "Illegal escape space".

Would you have any suggestions on why I get these messages and/or how I could proceed to extract these strings?

Many many thanks in advance <3

You can use SPLIT instead:

SELECT SPLIT("bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed", "|")[OFFSET(0)]
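Applied to a column rather than a literal, the same idea could look like the sketch below. The `sample` CTE and its second row are made up for illustration; `String_Name` is the column name from the question. SPLIT returns an array of strings, so element 0 is the text before the first "|", and a value with no "|" at all should come back whole.

```sql
-- Hypothetical sample data standing in for the real table.
WITH sample AS (
  SELECT 'bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed' AS String_Name
  UNION ALL
  SELECT 'no-bar-in-this-value'
)
SELECT
  String_Name,
  -- SPLIT yields an ARRAY<STRING>; OFFSET(0) picks the part before the first "|".
  SPLIT(String_Name, '|')[OFFSET(0)] AS before_bar
FROM sample;
```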


Prefix the pattern string with r:

SELECT REGEXP_EXTRACT(String_Name, r'\S*\s*\|')

This is the syntax for a raw string constant. You can review what this means in the documentation.
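As for why the two attempts in the question fail: as far as I know, BigQuery's regex functions are built on RE2, which has no lookahead, hence "invalid perl operator: (?=". And in an ordinary quoted string, BigQuery tries to interpret "\S" as a string escape before the regex engine ever sees it, which is where the escape error comes from. Note also that a pattern ending in `\|` includes the "|" itself in the match. If the goal is only the text before the bar, a negated character class avoids both problems. This is a sketch under the same assumptions as the question (a column called `String_Name`; the `sample` CTE is made up for illustration):

```sql
-- Hypothetical sample data standing in for the real table.
WITH sample AS (
  SELECT 'bla-BLABLA-cid=123456_sept1220_blabla--potato-Blah|someMore_string_stuff-IDontNeed' AS String_Name
)
SELECT
  String_Name,
  -- ^      anchor at the start of the string
  -- [^|]*  any run of characters that are not "|" (no escaping needed inside [])
  REGEXP_EXTRACT(String_Name, r'^[^|]*') AS before_bar
FROM sample;
```

Because the pattern never consumes the "|", the result stops right before it, and no lookahead is required.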
