简体   繁体   中英

Finding strings between dashes using REGEXP_EXTRACT in Bigquery

In Bigquery, I am trying to find a way to extract particular segments of a string based on how many dashes come before it. The number of total dashes in the string will always be the same. For example, I could be looking for the string after the second dash and before the third dash in the following string:

abc-defgh-hij-kl-mnop

Currently, I am using the following regex to extract, which counts the dashes from the back:

([^-]+)(?:-[^-]+){2}$

The problem is that if there is nothing in between the dashes, the regex doesn't work. For example, something like this returns null :

abc-defgh-hij--mnop

Is there a way to use regex to extract a string after a certain number of dashes and cut it off before the subsequent dash?

Thank you!

Below is for BigQuery Standrd SQL

The simplest way in your case is to use SPLIT and OFFSET as in below example

SELECT SPLIT(str, '-')[OFFSET(3)]   

above will return empty string for abc-defgh-hij--mnop

to prevent error in case of calling non-existing element - better to use SAFE_OFFSET

SELECT SPLIT(str, '-')[SAFE_OFFSET(3)]   

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM