INSERT INTO table SELECT Redshift super slow
We have a large table that we need to DEEP COPY. Since we don't have enough free disk space to do it in one statement, I've tried to do it in batches. But the batches seem to run very slowly.
I'm running something like this:
INSERT INTO new_table
SELECT * FROM old_table
WHERE creation_date between '2018-01-01' AND '2018-02-01'
Even though the following query returns only a small number of rows (~1K):
SELECT * FROM old_table
WHERE creation_date between '2018-01-01' AND '2018-02-01'
the INSERT query takes around 50 minutes to complete.
The old_table has ~286M rows and ~400 columns.
creation_date is one of the SORTKEYs.
The explain plan looks like:
XN Seq Scan on old_table (cost=0.00..4543811.52 rows=178152 width=136883)
Filter: ((creation_date <= '2018-02-01'::date) AND (creation_date >= '2018-01-01'::date))
My question is:
What could be the reason for the INSERT query to take this long?

In my opinion, there are two possibilities, though it would be great if you could add more details to your question.
Is creation_date a sortkey?
Has old_table had a lot of updates? If so, you must vacuum first: run VACUUM DELETE ONLY old_table, then run the select queries.
Another option is to do it the S3 way, but I'm not sure whether you want to do that.
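
The two checks above could be sketched roughly as follows. This is only a sketch under assumptions: it reuses the table and column names from the question, assumes the SVV_TABLE_INFO system view is visible to your user, and assumes the next batch would start at 2018-02-01. The unsorted and stats_off columns give a rough idea of whether the table needs a vacuum or an analyze.

```sql
-- Which column is the leading sort key, and how unsorted / stale is the table?
SELECT "table", sortkey1, sortkey_num, unsorted, stats_off, tbl_rows
FROM svv_table_info
WHERE "table" = 'old_table';

-- If old_table has had many updates or deletes, reclaim the deleted rows first.
VACUUM DELETE ONLY old_table;

-- Then retry one batch. A half-open range (>= start, < end) is used here
-- instead of BETWEEN, which is inclusive on both ends and would copy rows
-- dated 2018-02-01 a second time if the next batch starts on that date.
INSERT INTO new_table
SELECT * FROM old_table
WHERE creation_date >= '2018-01-01'
  AND creation_date <  '2018-02-01';
```

If sortkey1 turns out not to be creation_date, the date filter may not be able to skip many blocks, which would be consistent with the sequential scan shown in the explain plan.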