How can I apply aggregate functions to data extracted from JSON in Google BigQuery?

I am extracting JSON data out of a BigQuery column using JSON_EXTRACT. Now I want to extract lists of values and run aggregate functions (like AVG) against them. Testing the JsonPath expression .objects[*].v succeeds on http://jsonpath.curiousconcept.com/. But the query:

SELECT
  JSON_EXTRACT(json_column, "$.id") as id,
  AVG(JSON_EXTRACT(json_column, "$.objects[*].v")) as average_value
FROM [tablename]

throws a JsonPath parse error on BigQuery. Is this possible on BigQuery? Or do I need to preprocess my data in order to run aggregate functions against data inside my JSON?

My data looks similar to this:

# Record 1
{
  "id": "abc",
  "objects": [
    {
      "id": 1,
      "v": 1
    },
    {
      "id": 2,
      "v": 3
    }
  ]
}
# Record 2
{
  "id": "def",
  "objects": [
    {
      "id": 1,
      "v": 2
    },
    {
      "id": 2,
      "v": 5
    }
  ]
}

This is related to another question.

Update: The problem can be simplified by running two queries. First, run JSON_EXTRACT and save the results into a view. Second, run the aggregate function against this view. But even then I need to correct the JsonPath expression $.objects[*].v to prevent the JSONPath parse error.

Leverage SPLIT() to pivot repeatable fields into separate rows. It might also be easier and cleaner to put this into a subquery and apply AVG outside:

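SELECT id, AVG(v) AS average
FROM (
  SELECT
    JSON_EXTRACT(json_column, "$.id") AS id,
    -- SPLIT yields one fragment per array element; the regex pulls out "v".
    -- No trailing comma is required, so "v" may be the last key in an object.
    INTEGER(
      REGEXP_EXTRACT(
        SPLIT(JSON_EXTRACT(json_column, "$.objects"), "},{"),
        r'"v":(-?\d+)')) AS v
  FROM [mytable]
)
GROUP BY id;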
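
To check what this split-then-extract approach should return for the two sample records in the question, the same steps can be sketched outside BigQuery. The Python below is only an illustration, not BigQuery code: the helper name `average_v_by_id` and the `RECORDS` list are hypothetical, and it assumes the extracted array is serialized compactly (no whitespace between elements), which is what splitting on "},{" relies on.

```python
import json
import re
from collections import defaultdict

# The two sample records from the question, as JSON strings.
RECORDS = [
    '{"id": "abc", "objects": [{"id": 1, "v": 1}, {"id": 2, "v": 3}]}',
    '{"id": "def", "objects": [{"id": 1, "v": 2}, {"id": 2, "v": 5}]}',
]

# Matches the integer value of the "v" key inside one array fragment.
V_PATTERN = re.compile(r'"v":(-?\d+)')


def average_v_by_id(records):
    """Hypothetical helper: average of objects[*].v per top-level id."""
    values = defaultdict(list)
    for raw in records:
        doc = json.loads(raw)
        # Compact serialization (assumption: no spaces between elements).
        objects = json.dumps(doc["objects"], separators=(",", ":"))
        # Same idea as SPLIT(..., "},{"): one fragment per array element.
        for fragment in objects.split("},{"):
            match = V_PATTERN.search(fragment)
            if match:
                values[doc["id"]].append(int(match.group(1)))
    # Same idea as AVG(v) ... GROUP BY id.
    return {key: sum(vs) / len(vs) for key, vs in values.items()}


averages = average_v_by_id(RECORDS)
print(averages)  # {'abc': 2.0, 'def': 3.5}
```

Splitting on "},{" only works while the array elements are flat objects; if an element contained a nested object, the fragments would no longer line up with the elements.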

