How to extract just numeric value with REGEXP_EXTRACT in BigQuery?

I am trying to extract just the numbers from a particular column in BigQuery. The values have this format: "Livraison_21J|Relais_19J" or "RELAIS_15 DAY". I am trying to extract the number of days that follows the keyword "Relais" in each value. The days range from 1 to 100.

I used this to do so:

SELECT CAST(REGEXP_EXTRACT(delivery, r"RELAIS_([0-9]+J)") as string) as relayDay
FROM TABLE

I want to be able to extract just the number of days regardless of the string that comes after the number, be it "J" or "DAY".

Sample data (input on the left, current output of the query above on the right):

RETRAIT_2H|LIVRAISON_5J|RELAIS_5J    |     5J

LIVRAISON_21J|RELAIS_19J             |     19J

LIVRAISON_21J|RELAIS_19J             |     19J

RETRAIT_2H|LIVRAISON_3J|RELAIS_3J    |     3J

You may use

REGEXP_EXTRACT(delivery, r"(?:.*\D)?(\d+)\s*(?:J|DAY)")

Details

  • (?:.*\D)? - an optional non-capturing group that matches zero or more characters other than line break characters, as many as possible, followed by a non-digit character. This group is needed to advance the match position to just before the last sequence of digits, rather than just before the last digit.
  • (\d+) - Group 1, the value that REGEXP_EXTRACT returns: one or more digits
  • \s* - zero or more whitespace characters
  • (?:J|DAY) - either J or DAY.
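
As a rough sketch of how this behaves, the query below runs the pattern over a few inline rows taken from the question. The CTE name sample and the output column names are made up for illustration. Note that the pattern above returns the last number followed by J or DAY anywhere in the string, which lands on the RELAIS entry only because that entry comes last in the sample data. The second column shows one possible alternative, assuming you would rather anchor on the keyword itself: (?i) makes the match case-insensitive so that both "Relais" and "RELAIS" are accepted, and SAFE_CAST turns the result into an integer.

```sql
-- Inline sample rows (hypothetical CTE; replace with your own table).
WITH sample AS (
  SELECT 'RETRAIT_2H|LIVRAISON_5J|RELAIS_5J' AS delivery UNION ALL
  SELECT 'LIVRAISON_21J|RELAIS_19J' UNION ALL
  SELECT 'RELAIS_15 DAY' UNION ALL
  SELECT 'Livraison_21J|Relais_19J'
)
SELECT
  delivery,
  -- Pattern from the answer: last run of digits followed by J or DAY.
  REGEXP_EXTRACT(delivery, r'(?:.*\D)?(\d+)\s*(?:J|DAY)') AS last_day_count,
  -- Alternative (an assumption, not part of the answer above):
  -- digits directly after the keyword, whatever unit follows.
  SAFE_CAST(REGEXP_EXTRACT(delivery, r'(?i)RELAIS_(\d+)') AS INT64) AS relay_days
FROM sample;
```

On these four rows both columns should come out as 5, 19, 15 and 19, the first as STRING and the second as INT64. The two patterns would differ on a value where RELAIS is not the last entry, for example 'RELAIS_5J|LIVRAISON_21J'.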
