

BigQuery REGEXP_EXTRACT from URL - extract parameter values

I need to use REGEXP_EXTRACT on various URLs I have in BigQuery and extract different strings from them.

For example, I have this URL:

url = https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search

I want to use the BigQuery REGEXP_EXTRACT function to extract the string that follows the parameter named context_alg= (it appears after the first & in the URL). In other words, my output should be nodes.

(context_alg is a parameter in the URL and always has the same name.)

So I need something like:

REGEXP_EXTRACT(url, "<regexp that returns 'nodes'>")

Thank you!

Try the following in BigQuery:

REGEXP_EXTRACT(url, r'context_alg=([^?&#]*)')  
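To try the expression above without touching a real table, here is a minimal sketch in BigQuery Standard SQL. The CTE name sample and the output column name context_alg are made up for illustration; the URL is the one from the question.

```sql
-- Hypothetical one-row input; replace `sample` with your own table.
WITH sample AS (
  SELECT 'https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search' AS url
)
SELECT
  -- When the pattern has one capturing group, REGEXP_EXTRACT returns
  -- the text matched by that group, i.e. everything after
  -- "context_alg=" up to the next ?, & or #.
  REGEXP_EXTRACT(url, r'context_alg=([^?&#]*)') AS context_alg
FROM sample;
-- context_alg: nodes
```

If your URLs might also contain a parameter whose name merely ends in context_alg (say, a hypothetical my_context_alg), a stricter variant such as r'[?&]context_alg=([^&#]*)' requires the name to start right after ? or &. REGEXP_EXTRACT returns NULL when the parameter is absent.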

If you need to extract all parameters from a URL, you can also use REGEXP_EXTRACT_ALL as follows:

REGEXP_EXTRACT_ALL(query,r'(?:\?|&)((?:[^=]+)=(?:[^&]*))') as params

This returns the result as an array of name=value strings (see "How to extract URL parameters as ARRAY in Google BigQuery").
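As a rough sketch of how that array might be used, the query below unnests it into one row per parameter and splits each entry into a name and a value. The CTE sample, the alias param, and the output column names are assumptions for illustration, and the column is called url here rather than query.

```sql
-- Hypothetical one-row input; replace `sample` with your own table.
WITH sample AS (
  SELECT 'https://www.whatever.com/record-a-beautiful-and-professional-voice-over?sec_context=recommendation&context_alg=nodes&sec_context_referrer=search' AS url
)
SELECT
  -- Each array element looks like "name=value".
  SPLIT(param, '=')[OFFSET(0)] AS name,
  SPLIT(param, '=')[SAFE_OFFSET(1)] AS value
FROM sample,
  UNNEST(REGEXP_EXTRACT_ALL(url, r'(?:\?|&)((?:[^=]+)=(?:[^&]*))')) AS param;
-- Rows for the sample URL:
--   sec_context           recommendation
--   context_alg           nodes
--   sec_context_referrer  search
```

Two caveats with this sketch: a value that itself contains = is cut at its first = by the SPLIT step, and because the pattern only stops at &, a trailing #fragment would end up inside the last value.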
