How to optimize AWS DMS MySql Aurora to Redshift replication?

Original 2017-12-27 08:22:55 5 2 amazon-web-services / amazon-redshift / amazon-aurora / aws-dms

I've been using AWS DMS to perform ongoing replication from MySql Aurora to Redshift. However, the ongoing replication causes a constant 25-30% CPU load on the target, because it produces many small files on S3 and loads and processes them non-stop. Redshift is not really designed to handle a large number of small tasks.

To optimize this, I've made the process start at the beginning of each hour, wait until the target is in sync, and then stop. So instead of working continually, it works for 5-8 minutes at the beginning of each hour. Even so, it is still slow and unoptimized, because it still has to process hundreds of small S3 files, only in a shorter timespan.

Can this be optimized further? Is there a way to tell DMS to buffer these changes for a longer period of time and produce fewer, larger S3 files instead of many small ones? We really don't mind having higher target latency.

The amount of data transferred between Aurora and Redshift is rather small. There are around 20K changes per hour, and we're using a 4-node dc1.large Redshift cluster. It should be able to handle those 20K changes in a matter of seconds, not minutes.

2 answers

You could try the BatchApplyTimeoutMin and BatchApplyTimeoutMax task settings: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.CustomizingTasks.TaskSettings.ChangeProcessingTuning.html

BatchApplyTimeoutMin sets the minimum amount of time in seconds that AWS DMS waits between each application of batch changes. The default value is 1.

You can raise the value to 1200, or even 3600.
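As a rough illustration of the suggestion above, the following Python sketch builds the `ChangeProcessingTuning` fragment of the task settings JSON. The helper name `build_batch_tuning` is hypothetical, and the check that the minimum does not exceed the maximum is an assumption about how the two values relate, not something stated in the answer. Pairing 1200 with 3600 is only one possible choice.

```python
import json


def build_batch_tuning(min_wait_s: int, max_wait_s: int) -> str:
    """Build a task-settings JSON fragment that widens the batch-apply window.

    Hypothetical helper: it only assembles JSON and does not call AWS.
    """
    # Assumption: the minimum wait should not be larger than the maximum wait.
    if min_wait_s > max_wait_s:
        raise ValueError("BatchApplyTimeoutMin should not exceed BatchApplyTimeoutMax")
    settings = {
        "ChangeProcessingTuning": {
            "BatchApplyTimeoutMin": min_wait_s,  # seconds to wait between batch applies
            "BatchApplyTimeoutMax": max_wait_s,  # upper bound on the wait, in seconds
        }
    }
    return json.dumps(settings, indent=2)


# Values taken from the answer: wait at least 20 minutes, at most one hour.
task_settings_json = build_batch_tuning(1200, 3600)
print(task_settings_json)
```

The resulting string could then be supplied as the task settings when modifying the replication task (for example in the console's JSON editor). Other batch settings, such as memory limits, may still trigger an earlier flush, so the effect is worth verifying on a test task.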

Increase maxFileSize in the target settings - https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Redshift.html
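A minimal sketch of what that could look like, assuming `maxFileSize` is given in KB as an extra connection attribute on the Redshift target endpoint. The helper name `redshift_extra_attrs`, the bounds check, and the 262144 KB (256 MB) value are illustrative assumptions, so check the linked page for the current default and valid range.

```python
def redshift_extra_attrs(max_file_size_kb: int) -> str:
    """Format a maxFileSize extra connection attribute for a Redshift target.

    Hypothetical helper: it only builds the attribute string.
    """
    # Assumption: the value is in KB and must fall within 1..1,048,576.
    if not 1 <= max_file_size_kb <= 1_048_576:
        raise ValueError("maxFileSize is expected to be between 1 and 1048576 KB")
    return f"maxFileSize={max_file_size_kb};"


# Illustrative value: allow files of up to 256 MB.
extra_attrs = redshift_extra_attrs(256 * 1024)
print(extra_attrs)
```

A larger file size only raises the ceiling per file; with around 20K changes per hour, the batch timing settings are likely what determines how many files are written.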


