
Reaching data limit with AWS Redshift

I am playing around with some SQL code to implement a basic model I've been working on for a company I recently joined. In doing so, one of my SQL queries hit the disk limit and returned an error saying so. My question is: does my running complicated, large models impede the company's other applications on Redshift? Does my company get a notification that I have gone over my limit?

EDIT: The SQL query was a number of nested SQL queries. The error was "error: Disk Full code: 1016".

First, try running this without running any other queries at the same time:

select
    sum(capacity)/1024 as capacity_gbytes,
    sum(used)/1024 as used_gbytes,
    (sum(capacity) - sum(used))/1024 as free_gbytes
from stv_partitions
where part_begin = 0;

That will show you the amount of free disk space you have.
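If free space does turn out to be low, one way to see what is consuming it is to list the largest tables. This is a sketch that assumes you have access to the standard Redshift system view svv_table_info, whose size column is reported in 1 MB blocks:

```sql
-- Sketch: show the ten largest tables on the cluster.
-- svv_table_info is a standard Redshift system view; size is in 1 MB blocks.
select "schema", "table", size as size_mb, pct_used
from svv_table_info
order by size desc
limit 10;
```

Note that svv_table_info only shows tables you have permission to access, so on a shared cluster the totals may not add up to the stv_partitions numbers.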

My guess is that, if your free space is actually fairly large, you can still get that error by using ORDER BY in a statement that selects a lot of data (subquery or final query, it doesn't matter). In that case, all the data is sent to the leader node, causing disk overflow. Rewriting the statement without ORDER BY will help.
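To illustrate the kind of rewrite meant here, consider a hedged example with invented table and column names. An ORDER BY inside a subquery forces a large sort whose result the outer query re-aggregates anyway, so the sort does nothing but consume space:

```sql
-- Hypothetical example; the sales table and its columns are invented.
-- Problematic: the inner ORDER BY sorts the full table for no benefit,
-- since the outer GROUP BY does not depend on row order.
select customer_id, sum(amount) as total
from (
    select customer_id, amount
    from sales
    order by amount desc   -- unnecessary sort over a large data set
) s
group by customer_id;

-- Safer rewrite: drop the inner ORDER BY entirely.
select customer_id, sum(amount) as total
from sales
group by customer_id;
```

In general, keep ORDER BY only on the final, outermost query, and only when the caller actually needs sorted output.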

