BigQuery REGEXP_EXTRACT from URL - extract parameters values

I need to use REGEXP_EXTRACT on various URLs I have in BigQuery and extract different strings from them.

For example, I have this URL:

url = https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search

I want to use the BigQuery REGEXP_EXTRACT function to extract the value of the parameter named context_alg (it appears after the first & in the URL). In other words, my output should be nodes.

(context_alg is a parameter in the URL and always has the same name.)

So actually I need to use something like:

REGEXP_EXTRACT(url, "regex that returns 'nodes'")

Thank you!

Try the following in BigQuery:

REGEXP_EXTRACT(url, r'context_alg=([^?&#]*)')  
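To try the expression above on the URL from the question, a minimal self-contained sketch in BigQuery Standard SQL might look like this. The CTE name sample and the second, stricter pattern are illustrative assumptions, not part of the original answer; the stricter pattern requires ? or & right before the name, so that a longer parameter name that merely ends in context_alg would not match.

```sql
-- Inline sample row holding the URL from the question (CTE name is hypothetical).
WITH sample AS (
  SELECT 'https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search' AS url
)
SELECT
  -- Pattern from the answer: captures everything after "context_alg="
  -- up to the next ?, & or #.
  REGEXP_EXTRACT(url, r'context_alg=([^?&#]*)') AS context_alg,
  -- Assumed stricter variant: the name must directly follow ? or &.
  REGEXP_EXTRACT(url, r'[?&]context_alg=([^&#]*)') AS context_alg_strict
FROM sample;
```

For this URL both columns should contain nodes. REGEXP_EXTRACT returns NULL when the parameter is absent.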

If you need to extract all parameters from a URL, you can also use REGEXP_EXTRACT_ALL as follows:

REGEXP_EXTRACT_ALL(query,r'(?:\?|&)((?:[^=]+)=(?:[^&]*))') as params

This returns the result as an array of name=value strings, one element per parameter (see "How to extract URL parameters as ARRAY in Google BigQuery").
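As a rough illustration of working with that array, the sketch below unnests it and splits each element into a name and a value. The CTE name sample and the output column names are assumptions made for this example; splitting on = also assumes the value itself contains no = character.

```sql
-- Inline sample row; the column is called query to match the expression above.
WITH sample AS (
  SELECT 'https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search' AS query
)
SELECT
  -- Part before the first "=" is the parameter name.
  SPLIT(param, '=')[OFFSET(0)] AS name,
  -- SAFE_OFFSET yields NULL instead of an error if nothing follows "=".
  SPLIT(param, '=')[SAFE_OFFSET(1)] AS value
FROM sample,
  -- One row per "name=value" element extracted from the query string.
  UNNEST(REGEXP_EXTRACT_ALL(query, r'(?:\?|&)((?:[^=]+)=(?:[^&]*))')) AS param;
```

For the sample URL this should produce one row each for sec_context, context_alg and sec_context_referrer.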
