
Amazon Redshift: how to copy from S3 and set a job_id

Amazon Redshift can load table data from S3 objects using the COPY command. Is there a way to use COPY but also set an additional "col=CONSTANT" on each inserted row?

I want to set a job_id (which is not in the source data) on each copied row. It would be a shame to have to execute a few million INSERTs just so each row has a job attribute, when COPY gets me 99% of the way there with much better performance.

Maybe there is a more clever solution?

If you want all rows added by a single COPY command to have the same job_id value, you can COPY the data into a staging table, add the job_id column to that table, and then insert everything from the staging table into the final table, like this:

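-- Staging table with the same columns as destination, minus job_id.
CREATE TABLE destination_staging (LIKE destination);
ALTER TABLE destination_staging DROP COLUMN job_id;
COPY destination_staging FROM 's3://data/destination/(...)' (...);
-- Re-add job_id with the constant as its default, so every loaded row gets it.
-- The column is appended last, so SELECT * assumes job_id is also the last column of destination.
ALTER TABLE destination_staging ADD COLUMN job_id INT DEFAULT 42;
INSERT INTO destination SELECT * FROM destination_staging ORDER BY sortkey_column;
DROP TABLE destination_staging;
ANALYZE destination;
VACUUM destination;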

ANALYZE and VACUUM are not required, but they are highly recommended: ANALYZE updates the statistics used by the query planner, and VACUUM re-sorts the table so the new rows end up in the correct positions.

It seems there is no option to do pre- or post-processing with the COPY command itself. Your best option therefore seems to be to preprocess the files you intend to COPY into Redshift, add the job_id to each row, and then load them into Redshift.
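The preprocessing step could look roughly like the following. This is only a minimal sketch, assuming the source files are comma-delimited text and that job_id is the last column of the target table; the function name `add_job_id` and the sample rows are hypothetical, and downloading from and uploading to S3 is left out.

```python
import csv
import io


def add_job_id(src, dst, job_id, delimiter=","):
    """Copy delimited rows from src to dst, appending job_id to every row.

    src and dst are text file objects. Returns the number of rows written.
    Assumption: job_id is the last column of the target table.
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    count = 0
    for row in reader:
        if not row:  # skip blank lines instead of emitting a lone job_id
            continue
        writer.writerow(row + [str(job_id)])
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample data standing in for one source file.
    source = io.StringIO("1,alice\n2,bob\n")
    target = io.StringIO()
    add_job_id(source, target, job_id=42)
    print(target.getvalue(), end="")
```

The rewritten files would then be uploaded and loaded with a normal COPY. Compared with the staging-table approach, this moves the work out of the cluster but means every file has to be read and written once more before loading.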
