Is there a way to parse XML tags in BigQuery Standard SQL?

I have read that it's a bad idea to parse XML/HTML using regular expressions. The suggested alternative is to use an XML parser. Does one exist in the BigQuery Standard SQL library?

Here is the documentation on how to use JavaScript UDFs in BigQuery, as Elliot mentioned:

https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions

I imagine the UDF might look something like this:

CREATE TEMPORARY FUNCTION XML(x STRING)
RETURNS STRING
LANGUAGE js AS """
  // fromXML is expected to come from the library file listed in OPTIONS below.
  var data = fromXML(x);
  return data.title;
"""
OPTIONS (
  library="gs://<BUCKET_NAME>/from-xml.min.js"
);

SELECT XML(a) FROM UNNEST(["<title>Title of Page</title>"]) AS a;

Here, from-xml.min.js comes from the from-xml library and has been uploaded to your GCS bucket.
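The UDF body above assumes every row holds well-formed XML with a single text-only title element. A slightly more defensive version of that body could look like the sketch below. The helper name extractTitle is hypothetical (it is not part of BigQuery or of the library), and the parser is passed in as an argument only so the logic can be tried outside BigQuery. It also assumes that a SQL NULL arrives in JavaScript as null and that the parser may throw on malformed input.

```javascript
// Hypothetical helper, not part of BigQuery or the from-xml library.
// `parse` stands for whatever XML-to-object function the loaded library
// provides (fromXML in the UDF above); it is a parameter so the logic can
// be exercised on its own.
function extractTitle(xml, parse) {
  // Assumption: a SQL NULL input reaches the JavaScript code as null.
  if (xml === null || xml === undefined) {
    return null;
  }
  var data;
  try {
    data = parse(xml);
  } catch (e) {
    // Assumption: the parser may throw on malformed input; map that to NULL.
    return null;
  }
  // Return only a plain string. A nested or repeated <title> would likely
  // come back as an object or array, which does not map cleanly to STRING.
  if (data && typeof data.title === "string") {
    return data.title;
  }
  return null;
}
```

Inside the UDF, the body would then reduce to something like return extractTitle(x, fromXML); with the helper either pasted into the function body or placed in a second library file.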


 