
Implementation of initial database replication

How do various databases copy data (replicate) to a new instance when it is added to a replication setup? That is, when we add a new instance, how is the data loaded into it?

There is a lot of information about replication methods, but it covers the case where the target database instance already holds the same data as its source, not the case where the new database instance starts out empty.

There are basically 3 approaches here.

First, you start capturing changes from the source database with a CDC (change data capture) tool. Since the target database has not been created yet, you store all the changes so that they can be applied later.
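
A minimal sketch of this buffering step, under stated assumptions: every captured change carries a strictly increasing log position (comparable to an LSN or SCN), the snapshot is known to correspond to one such position, and the target table is stood in for by a plain dictionary. The names `Change`, `ChangeBuffer`, `apply_change` and `initial_load` are hypothetical and do not come from any particular CDC tool.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional

Row = Dict[str, Any]


@dataclass(frozen=True)
class Change:
    position: int              # hypothetical log position, strictly increasing
    op: str                    # "INSERT", "UPDATE" or "DELETE"
    key: Any                   # primary key of the affected row
    row: Optional[Row] = None  # full row image; None for DELETE


@dataclass
class ChangeBuffer:
    """Stores captured changes until the target is ready to receive them."""
    changes: List[Change] = field(default_factory=list)

    def capture(self, change: Change) -> None:
        if self.changes and change.position <= self.changes[-1].position:
            raise ValueError("changes must arrive in increasing position order")
        self.changes.append(change)

    def replay_after(self, snapshot_position: int) -> Iterator[Change]:
        """Yield only the changes made after the snapshot was taken."""
        return (c for c in self.changes if c.position > snapshot_position)


def apply_change(target: Dict[Any, Row], change: Change) -> None:
    """Apply one change to the in-memory stand-in for the target table."""
    if change.op == "DELETE":
        del target[change.key]
    else:  # INSERT and UPDATE both carry the full row image in this sketch
        target[change.key] = dict(change.row)


def initial_load(snapshot: Dict[Any, Row], snapshot_position: int,
                 buffer: ChangeBuffer) -> Dict[Any, Row]:
    """Restore the snapshot, then replay everything captured after it."""
    target = {key: dict(row) for key, row in snapshot.items()}
    for change in buffer.replay_after(snapshot_position):
        apply_change(target, change)
    return target


buffer = ChangeBuffer()
buffer.capture(Change(1, "INSERT", 1, {"id": 1, "name": "a"}))
buffer.capture(Change(2, "INSERT", 2, {"id": 2, "name": "b"}))
# Snapshot taken at position 2: it already contains both inserts.
snapshot = {1: {"id": 1, "name": "a"}, 2: {"id": 2, "name": "b"}}
buffer.capture(Change(3, "UPDATE", 1, {"id": 1, "name": "a2"}))
buffer.capture(Change(4, "DELETE", 2))

target = initial_load(snapshot, snapshot_position=2, buffer=buffer)
print(target)
```

Changes at positions 1 and 2 are skipped because the snapshot already reflects them; only positions 3 and 4 are replayed. The sketch relies on the snapshot being consistent: a replayed DELETE for a row the snapshot does not contain would raise a `KeyError` here.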

What you do next depends on the architecture:

  1. If you need a 1:1 copy

    Take a backup of the source database and restore it to the target database. Knowing the point in time of the backup, you then apply the captured changes starting from the timestamp at which the backup was created.

    Assuming the backup of the database is consistent, the target ends up with the same data as the source, only delayed.

  2. If you need a subset of the tables or the target is a different vendor's database

    The approach is the same as in 1, but instead of backing up and restoring the full database you copy only a list of tables. You can also restore the database backup to a temporary location, export some of the tables (or only a subset of their columns), and then load them into the target.

    Once the target has been prepared this way, you start applying the changes from the source to the target.

  3. No source database snapshot available

    If you can't get a snapshot, the replication tool often provides a way to work without one. Depending on the tool, the feature is named AUTOCORRECTION (SAP/Sybase Replication Server) or HANDLECOLLISIONS (Oracle GoldenGate). It basically means that the replication tool has a full image of the row for each UPDATE operation, and when the record does not exist in the target, it is created. When the row for a DELETE does not exist, the operation is ignored. When the row for an INSERT already exists, the operation is ignored or, depending on the tool, the existing row is overwritten with the incoming values.

    To get the target into a consistent state, you run in this mode for some time, until the data is in sync, and then switch to regular replication.

    One thing to note about this mode: during the reconciliation the CDC tool must provide the full row content for every UPDATE. If the UPDATE contains only the modified columns, you cannot build an INSERT command (with all column values) when the row is missing.
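
The collision rules of the third approach can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of AUTOCORRECTION or HANDLECOLLISIONS: the target table is a plain dictionary keyed by primary key, every UPDATE carries the full row image, and a duplicate INSERT is skipped (a real tool may overwrite the row instead). The function name `apply_with_collision_handling` and its return labels are hypothetical.

```python
from typing import Any, Dict, Optional

Row = Dict[str, Any]


def apply_with_collision_handling(target: Dict[Any, Row], op: str, key: Any,
                                  row: Optional[Row] = None) -> str:
    """Apply one change, tolerating rows that are missing or already present."""
    exists = key in target

    if op == "INSERT":
        if exists:
            return "ignored"           # row already there: skip the INSERT
        target[key] = dict(row)
        return "inserted"

    if op == "UPDATE":
        if row is None:
            # Without the full row image a missing row cannot be re-created.
            raise ValueError("UPDATE must carry the full row image")
        target[key] = dict(row)
        # A missing row is created from the full image instead of failing.
        return "updated" if exists else "inserted"

    if op == "DELETE":
        if not exists:
            return "ignored"           # nothing to delete: skip the DELETE
        del target[key]
        return "deleted"

    raise ValueError(f"unknown operation: {op}")


target: Dict[Any, Row] = {}
results = [
    apply_with_collision_handling(target, "UPDATE", 1, {"id": 1, "name": "a"}),
    apply_with_collision_handling(target, "INSERT", 1, {"id": 1, "name": "old"}),
    apply_with_collision_handling(target, "DELETE", 2),
    apply_with_collision_handling(target, "UPDATE", 1, {"id": 1, "name": "b"}),
]
print(results)
print(target)
```

The first UPDATE hits an empty target and is turned into an insert, which is exactly the case that requires the full row image.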

Of course, the replication tool you use may incorporate the approaches described above and perform the task for you automatically.
