How to pipe data from AWS Postgres RDS to S3 (then Redshift)?
I'm using the AWS Data Pipeline service to pipe data from an RDS MySQL database to S3 and then on to Redshift, which works nicely.
However, I also have data living in an RDS Postgres instance which I would like to pipe the same way, but I'm having a hard time setting up the JDBC connection. If this is unsupported, is there a work-around?
"connectionString": "jdbc:postgresql://THE_RDS_INSTANCE:5432/THE_DB"
Nowadays you can define a CopyActivity to extract data from a Postgres RDS instance into S3. In the Data Pipeline interface, create a SqlDataNode for the source table, an S3DataNode for the destination, and a CopyActivity that uses the SqlDataNode as input and the S3DataNode as output.
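A minimal pipeline-definition sketch of that setup might look like the following (the node IDs, bucket path, table name, and credentials are hypothetical placeholders):

```json
{
  "objects": [
    {
      "id": "PostgresRds",
      "type": "RdsDatabase",
      "rdsInstanceId": "THE_RDS_INSTANCE",
      "username": "USER",
      "*password": "password"
    },
    {
      "id": "SourceTable",
      "type": "SqlDataNode",
      "database": { "ref": "PostgresRds" },
      "table": "blahs",
      "selectQuery": "select * from #{table}"
    },
    {
      "id": "S3Output",
      "type": "S3DataNode",
      "directoryPath": "s3://my-bucket/my-prefix/"
    },
    {
      "id": "RdsToS3",
      "type": "CopyActivity",
      "input": { "ref": "SourceTable" },
      "output": { "ref": "S3Output" },
      "runsOn": { "ref": "MyEC2Resource" }
    }
  ]
}
```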
This doesn't work yet; AWS hasn't built/released the functionality to connect nicely to Postgres. You can do it in a ShellCommandActivity, though. You can write a little Ruby or Python code to do it and drop that in a script on S3 using scriptUri. You could also just write a psql command to dump the table to a CSV and then pipe that to OUTPUT1_STAGING_DIR with "stage": "true" in that activity node.
Something like this:
{
  "id": "DumpCommand",
  "type": "ShellCommandActivity",
  "runsOn": { "ref": "MyEC2Resource" },
  "stage": "true",
  "output": { "ref": "S3ForRedshiftDataNode" },
  "command": "PGPASSWORD=password psql -h HOST -U USER -d DATABASE -p 5432 -t -A -F\",\" -c \"select blah_id from blahs\" > ${OUTPUT1_STAGING_DIR}/my_data.csv"
}
I didn't run this to verify because it's a pain to spin up a pipeline :( so double-check the escaping in the command.
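If you go the script-on-S3 route instead, a minimal Python sketch could look like the following. The CSV-serialization helper is runnable as-is; the psycopg2 connection code is shown as comments because it assumes a reachable database, and the host, credentials, table, and column names are hypothetical:

```python
# Sketch of a dump script you could place on S3 and run via scriptUri.
# Assumes psycopg2 is installed on the pipeline's EC2 resource.
import csv
import io


def rows_to_csv(rows, header):
    """Serialize query results (a list of tuples) to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# On the EC2 resource you would do something like:
#
#   import os
#   import psycopg2
#   conn = psycopg2.connect(host="HOST", dbname="DATABASE",
#                           user="USER", password="password", port=5432)
#   with conn.cursor() as cur:
#       cur.execute("select blah_id from blahs")
#       csv_text = rows_to_csv(cur.fetchall(), ["blah_id"])
#   out_path = os.path.join(os.environ["OUTPUT1_STAGING_DIR"], "my_data.csv")
#   with open(out_path, "w") as f:
#       f.write(csv_text)
```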
Look into the new stuff AWS just launched on parameterized templating data pipelines: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-custom-templates.html . It looks like it will allow encryption of arbitrary parameters.
AWS now allows partners to do near real-time RDS -> Redshift inserts.
https://aws.amazon.com/blogs/aws/fast-easy-free-sync-rds-to-redshift/