
Resources exceeded BigQuery

When running the following query I got this error:

Resources exceeded during query execution: The query could not be executed in the allotted memory.
Peak usage: 158% of limit.
Top memory consumer(s):
  sort operations used for analytic OVER() clauses: 98%
  other/unattributed: 2%

SELECT
  *,
  -- number the rows within each Column_A group, ordered by Column_B
  ROW_NUMBER() OVER (PARTITION BY Column_A ORDER BY Column_B) AS row_num
FROM (
  SELECT * FROM Table_1
  UNION ALL
  SELECT * FROM Table_2
  UNION ALL
  SELECT * FROM Table_3
)

Can someone help me change this query, or is it possible to change the memory limit in BigQuery?

Welcome Aaron,

This error means BigQuery cannot process the whole query within its memory limit. The ORDER BY clause is quite memory intensive; try removing it, and I would expect your query to run fine.

If you need the results ordered, try writing the unordered query out to a table, then running a new query on that table to order the results.
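A minimal sketch of that two-step approach, assuming a dataset you can write to. The name `my_dataset.unioned` is hypothetical and not from the question; substitute your own.

```sql
-- Step 1: materialize the unordered UNION ALL result.
-- `my_dataset.unioned` is a hypothetical destination table.
CREATE OR REPLACE TABLE my_dataset.unioned AS
SELECT * FROM Table_1
UNION ALL
SELECT * FROM Table_2
UNION ALL
SELECT * FROM Table_3;

-- Step 2: sort in a separate query that reads only the stored table.
SELECT *
FROM my_dataset.unioned
ORDER BY Column_B;
```

Splitting the work this way keeps the sort from running in the same job as the union. Whether it is enough depends on how large the sorted result is.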

If you're interested, here's an article on how BigQuery executes queries in memory: https://cloud.google.com/blog/products/gcp/in-memory-query-execution-in-google-bigquery

I don't believe you can override or change this memory limit, but I'm happy to be proven wrong.

Make sure your ORDER BY is executed as the very last step. Additionally, consider using a LIMIT clause to avoid "Resources Exceeded" or "Response too large" failures.
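As a rough illustration, the sort can be kept in the outermost query and paired with a LIMIT. The limit of 1000 is an arbitrary value chosen for the example, and only two of the tables are shown for brevity.

```sql
-- Filtering and unioning happen in the inner query;
-- ORDER BY appears only once, at the outermost level, together with LIMIT.
SELECT Column_A, Column_B
FROM (
  SELECT Column_A, Column_B FROM Table_1
  UNION ALL
  SELECT Column_A, Column_B FROM Table_2
)
ORDER BY Column_B
LIMIT 1000;  -- arbitrary example value
```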

My primary recommendation here is to use partitioning and clustering.

Partitioning applies to a date field, so if your Table_1, Table_2, ... have one, partition on it.

Clustering also greatly reduces the memory cost of OVER clauses with ORDER BY, because it sorts the storage blocks (BigQuery docs).

To make the most of both of the above, I would also replace your UNION ALL sub-query with a temporary table. Storing the result of the UNION ALL first, partitioning and clustering the resulting dataset, and only then computing the rank is much more efficient in terms of memory and storage (Medium article).

Your final statement should look something like:

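CREATE TEMP TABLE tmp
PARTITION BY date  -- `date` stands in for a DATE column present in all three tables
CLUSTER BY Column_A, Column_B
AS
SELECT * FROM Table_1
UNION ALL
SELECT * FROM Table_2
UNION ALL
SELECT * FROM Table_3
;

-- compute the row number from the stored, clustered result
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY Column_A ORDER BY Column_B) AS row_num
FROM tmp;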

I've encountered this before, and it turned out I was trying to partition by a column that contained NULL values. Removing the NULL records worked!
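A sketch of how this could be checked and applied to the query from the question. The likely reason it helps is that all rows with a NULL Column_A fall into a single window partition, which then has to be sorted as one unit; that explanation is an assumption about this dataset, not something the error message confirms.

```sql
-- Check for skew: which Column_A values (including NULL) own the most rows?
SELECT Column_A, COUNT(*) AS row_count
FROM (
  SELECT Column_A FROM Table_1
  UNION ALL
  SELECT Column_A FROM Table_2
  UNION ALL
  SELECT Column_A FROM Table_3
)
GROUP BY Column_A
ORDER BY row_count DESC
LIMIT 10;

-- Drop the NULL keys before the window function is evaluated.
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY Column_A ORDER BY Column_B) AS row_num
FROM (
  SELECT * FROM Table_1
  UNION ALL
  SELECT * FROM Table_2
  UNION ALL
  SELECT * FROM Table_3
)
WHERE Column_A IS NOT NULL;
```

If the NULL rows are still needed, they can be selected separately with `WHERE Column_A IS NULL` and handled on their own.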

You can try OVER without using ORDER BY.
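For reference, this is what the query from the question looks like with the ORDER BY dropped from the window. ROW_NUMBER() does not require an ORDER BY in BigQuery, but without one the numbering inside each Column_A group is arbitrary and may differ between runs, so this only fits cases where any unique number per row is acceptable.

```sql
-- Row numbers are still unique within each Column_A group,
-- but their order is not defined without ORDER BY.
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY Column_A) AS row_num
FROM (
  SELECT * FROM Table_1
  UNION ALL
  SELECT * FROM Table_2
  UNION ALL
  SELECT * FROM Table_3
);
```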

