
How to get a string value with a regexp in BigQuery

Hi, I have a string in a BigQuery column like this:

cancellation_amount: 602000
after_cancellation_transaction_amount: 144500
refund_time: '2022-07-31T06:05:55.215203Z'
cancellation_amount: 144500
after_cancellation_transaction_amount: 0
refund_time: '2022-08-01T01:22:45.94919Z'

I am already using this logic to get cancellation_amount:

regexp_extract(file,r'.*cancellation_amount:\s*([^\n\r]*)')

But the output is only the amount 602000. I need the values 602000 and 144500 to come out as separate columns.

Any help is appreciated.

If the lines in your input (which will eventually become columns) are fixed, you can use multiple regexp_extract calls to get all the values.

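SELECT
    -- first cancellation_amount value found in the string
    regexp_extract(file, r'cancellation_amount:\s*([^\n\r]*)') as cancellation_amount,
    -- first after_cancellation_transaction_amount value found in the string
    regexp_extract(file, r'after_cancellation_transaction_amount:\s*([^\n\r]*)') as after_cancellation_transaction_amount
FROM table_name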

One issue I found with your regular expression is that .*cancellation_amount won't match after_cancellation_transaction_amount, because that key does not contain the literal text cancellation_amount.

There is also a function called regexp_extract_all, which returns all the matches as an array that you can later unpack into columns. If you have a fixed, finite set of values, though, extracting them directly into separate columns is easier.
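
As a minimal sketch of the regexp_extract_all route, the query below runs against the sample string from the question. The inline table sample_rows and the output column names cancellation_amount_1 and cancellation_amount_2 are made up for illustration, and it assumes there are at most two cancellation_amount entries per row. SAFE_OFFSET is used so that a missing match gives NULL instead of an error.

```sql
-- sample_rows is a hypothetical stand-in for the real table; it holds the
-- example string from the question in a column named file.
WITH sample_rows AS (
  SELECT """cancellation_amount: 602000
after_cancellation_transaction_amount: 144500
refund_time: '2022-07-31T06:05:55.215203Z'
cancellation_amount: 144500
after_cancellation_transaction_amount: 0
refund_time: '2022-08-01T01:22:45.94919Z'
""" AS file
),
matches AS (
  SELECT
    -- Collect the captured digits of every cancellation_amount entry.
    -- The after_cancellation_transaction_amount lines are not picked up,
    -- because they do not contain the literal text "cancellation_amount:".
    REGEXP_EXTRACT_ALL(file, r'cancellation_amount:\s*(\d+)') AS amounts
  FROM sample_rows
)
SELECT
  amounts[SAFE_OFFSET(0)] AS cancellation_amount_1,  -- first match
  amounts[SAFE_OFFSET(1)] AS cancellation_amount_2   -- second match, NULL if absent
FROM matches;
```

For the sample string, this should return the strings 602000 and 144500 in the two columns. Wrap them in CAST(... AS INT64) if numeric values are needed.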
