How to export data from AWS Aurora Postgres DB to Redshift?
I have a Postgres DB hosted on AWS Aurora from which I need to retrieve data and insert it into Redshift.

My current approach is as follows:
- Create a connection to the Aurora DB
- Using that connection, query the Aurora DB table and export the result set to S3 as a CSV file using OUTFILE

I'm trying to optimize this by removing the S3 service and connecting Aurora to Redshift directly.
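For context, a minimal sketch of what that export step can look like on Aurora PostgreSQL, assuming the aws_s3 extension is installed and the cluster has an IAM role allowed to write to the bucket (the bucket name, key, and region below are placeholders, not values from the question):

```sql
-- Hypothetical export of a query result set to S3 as CSV from Aurora PostgreSQL.
-- Requires: CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE; plus S3 write permissions.
SELECT *
FROM aws_s3.query_export_to_s3(
    'SELECT * FROM table1',                                                      -- query whose result set is exported
    aws_commons.create_s3_uri('my-bucket', 'exports/table1.csv', 'us-east-1'),   -- placeholder bucket/key/region
    options := 'format csv'                                                      -- standard PostgreSQL COPY options
);
```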
Here's what I want to do, but couldn't find resources for:

Query the Aurora table - table1 - and directly export the result set into the Redshift table - table1.

I'm not even sure if this is possible with the current system. Any thoughts?
There are two ways to get data into an Amazon Redshift database:

- COPY command to load from Amazon S3
- INSERT statement to insert data provided as part of the SQL statement

The COPY method is recommended for normal data loading. It runs in parallel across slices and stores the data as efficiently as possible given that it is appending data.
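A minimal sketch of the COPY route, assuming the exported CSV is already in S3 and the Redshift cluster has an attached IAM role with read access to the bucket (the bucket path and role ARN below are placeholders):

```sql
-- Hypothetical load of the exported CSV file from S3 into the Redshift table.
COPY table1
FROM 's3://my-bucket/exports/table1.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-load-role'  -- placeholder role ARN
FORMAT AS CSV;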
The INSERT command is acceptable for a small number of inserts, but not a good idea for inserting lots of rows. Where possible, insert multiple rows at a time. It is acceptable to use INSERT... SELECT statements, which can insert bulk data from a different table in one operation.
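To illustrate the difference, here is a sketch with hypothetical column names and a hypothetical staging table (none of these identifiers come from the question):

```sql
-- Multi-row INSERT: one statement carrying several rows is preferable to
-- issuing a separate single-row INSERT per row.
INSERT INTO table1 (id, name, created_at)
VALUES
    (1, 'alpha', '2023-01-01'),
    (2, 'beta',  '2023-01-02'),
    (3, 'gamma', '2023-01-03');

-- INSERT ... SELECT: bulk-insert rows from another table already in Redshift.
INSERT INTO table1
SELECT id, name, created_at
FROM staging_table1;
```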
So, the only way to remove Amazon S3 from your operation is to code the data into an INSERT statement, but this is not an optimal way to load the data.