BigQuery REGEXP_EXTRACT from URL - extract parameter values
I need to use REGEXP_EXTRACT on various URLs I have in BigQuery and extract different strings from them.
For example, I have this URL:
url = https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search
I want to use the BigQuery REGEXP_EXTRACT function to extract the string that comes after the parameter named context_alg= (it appears after the first & in the URL). In other words, my output should be nodes.
(context_alg is a parameter in the URL and always has the same name.)
So what I actually need is something like:
REGEXP_EXTRACT(url, "REGEXP that brings back 'nodes'")
Thank you!
Try the following in BigQuery:
REGEXP_EXTRACT(url, r'context_alg=([^?&#]*)')
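To try the expression without a real table, it can be wrapped in a small self-contained query. This is only a sketch: the CTE name sample and the second output column are illustrative assumptions, not part of the original answer. The second pattern requires a ? or & before the parameter name, which guards against matching a longer name that merely ends in context_alg.

```sql
-- The inline row stands in for a real table (hypothetical CTE name: sample).
WITH sample AS (
  SELECT 'https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search' AS url
)
SELECT
  -- Pattern from the answer: capture everything up to the next ?, & or #.
  REGEXP_EXTRACT(url, r'context_alg=([^?&#]*)') AS context_alg,
  -- Stricter variant (an assumption, not from the answer): the name must
  -- directly follow ? or &, so e.g. my_context_alg= would not match.
  REGEXP_EXTRACT(url, r'[?&]context_alg=([^&#]*)') AS context_alg_strict
FROM sample;
```

For the sample URL both columns should come back as nodes. If the parameter is absent, REGEXP_EXTRACT returns NULL.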
If you need to extract all parameters from a URL, you can also use REGEXP_EXTRACT_ALL as follows:
REGEXP_EXTRACT_ALL(query,r'(?:\?|&)((?:[^=]+)=(?:[^&]*))') as params
This returns the result as an array (see "How to extract URL parameters as ARRAY in Google BigQuery").
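As a rough illustration of the array result, the same pattern can be run against the URL from the question. The CTE name sample is a hypothetical stand-in for a real table, and the column is called url here rather than query.

```sql
-- The inline row stands in for a real table (hypothetical CTE name: sample).
WITH sample AS (
  SELECT 'https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search' AS url
)
SELECT
  -- Each array element is one key=value pair found after a ? or &.
  REGEXP_EXTRACT_ALL(url, r'(?:\?|&)((?:[^=]+)=(?:[^&]*))') AS params
FROM sample;
```

For this URL the params array should hold three elements: sec_context=recommendation, context_alg=nodes and sec_context_referrer=search. Note that the pattern does not stop at #, so a URL fragment after the last parameter would end up inside the last value.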