Regex to match the beginning from the end

I'm trying to come up with a regex that selects A|B|C from the first string below. The difficulty is that there are not always 3 values. The only fixed thing I know is that I want to select all values to the left of the 7th occurrence of |, counted from the end.

A|B|C|Lion|Zebra|Date|Whatever|Something|122|Scooby
A|Lion|Zebra|Date|Whatever|Something|122|Dooby
A|B|C|D|E|Lion|Zebra|Date|Whatever|Something|122|Doo

So the result should be:

A|B|C
A
A|B|C|D|E

Thanks

You can use REGEXP_EXTRACT as follows:

SELECT REGEXP_EXTRACT(str, r'^(.*)(?:\|[^|]+){7}$')

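Here (?:\|[^|]+){7}$ matches the last seven |-delimited fields, so the capture group (.*) holds everything to their left.

Note: this function is available in both BigQuery Standard SQL and Legacy SQL, but using Standard SQL is highly recommended.

The example below is for BigQuery Standard SQL and uses the dummy data from your question: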

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'A|B|C|Lion|Zebra|Date|Whatever|Something|122|Scooby' str UNION ALL
  SELECT 'A|Lion|Zebra|Date|Whatever|Something|122|Dooby' UNION ALL
  SELECT 'A|B|C|D|E|Lion|Zebra|Date|Whatever|Something|122|Doo'
)
SELECT 
  REGEXP_EXTRACT(str, r'^(.*)(?:\|[^|]+){7}$') result
FROM `project.dataset.table`   

The output is, as expected:

Row  result
1    A|B|C
2    A
3    A|B|C|D|E
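
If you want to check the pattern outside BigQuery, here is a minimal sketch in Python. It assumes Python's re module treats this particular pattern the same way as BigQuery's regex engine, which should hold for the simple constructs used here. The helper name leading_fields is hypothetical.

```python
import re

# Same pattern as in the query above: capture everything that precedes
# the last seven "|field" groups.
PATTERN = re.compile(r'^(.*)(?:\|[^|]+){7}$')

def leading_fields(s):
    """Return the part left of the 7th '|' from the end, or None if there is no match."""
    m = PATTERN.match(s)
    return m.group(1) if m else None

rows = [
    'A|B|C|Lion|Zebra|Date|Whatever|Something|122|Scooby',
    'A|Lion|Zebra|Date|Whatever|Something|122|Dooby',
    'A|B|C|D|E|Lion|Zebra|Date|Whatever|Something|122|Doo',
]

results = [leading_fields(r) for r in rows]
for r in results:
    print(r)
```

A string with fewer than eight fields does not match at all, so the helper returns None for it, much as REGEXP_EXTRACT returns NULL when there is no match.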

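Alternatively, delete the trailing part instead of extracting the leading part: replace whatever matches ([a-zA-Z0-9]+\\|?){7}$ with an empty string (the \\| is \| escaped for a string literal). Note that this pattern starts matching at the first of the last seven fields, so the separator in front of it is left behind (A|B|C|). Putting the separator in front of each field, as in (\\|[a-zA-Z0-9]+){7}$, removes it as well.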
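
The deletion approach can be tried in Python with re.sub. This sketch compares the pattern with the separator after each field against a variant with the separator in front of each field. The variant and the helper name strip_last_seven are assumptions for illustration, not part of the answer.

```python
import re

# Pattern from the answer, with the string-literal escaping removed.
TRAILING = re.compile(r'([a-zA-Z0-9]+\|?){7}$')
# Variant (an assumption): the separator sits in front of each field.
LEADING = re.compile(r'(\|[a-zA-Z0-9]+){7}$')

def strip_last_seven(pattern, s):
    """Delete the part of s matched by pattern."""
    return pattern.sub('', s)

s = 'A|B|C|Lion|Zebra|Date|Whatever|Something|122|Scooby'
with_trailing = strip_last_seven(TRAILING, s)
with_leading = strip_last_seven(LEADING, s)

print(with_trailing)  # A|B|C|  (the separator before "Lion" is kept)
print(with_leading)   # A|B|C
```

Both patterns only accept ASCII letters and digits inside a field. If fields may contain other characters, [^|]+ is the safer character class.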
