How to do cross-database information syncing?

I am designing a directory where data in multiple sources will have to override the data in other sources when altered or updated. Some of the databases are MySQL, SQL Server and some of the info will be AD/LDAP.

My question is this: is there a design pattern for this type of database propagation, to reduce traffic and prevent errors? Also this project will be in PHP, so if anyone knows of a similar open source project I could adapt from, that would be nice too. There will probably have to be some logic between some of the databases.

You'll need some way to flag the records to be synced. We use a system like that: each table to sync has a column that holds the record's sync state. When a record is modified, a trigger updates its sync state as well, and a synchronization tool queries for modified records every few minutes.
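As a rough illustration of the flag-column approach, here is a minimal sketch in Python using SQLite in-memory databases (chosen only so the example is self-contained; the same idea applies to MySQL or SQL Server triggers, and the polling loop could equally be written in PHP). The table `person`, the column name `syncstate`, and the function names are hypothetical, not taken from our actual system.

```python
import sqlite3

def make_source():
    """Source database: the table carries a syncstate column (hypothetical name)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE person (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            syncstate INTEGER NOT NULL DEFAULT 1  -- 1 = needs sync, 0 = synced
        );
        -- Fires only when 'name' changes, so resetting syncstate does not re-flag the row.
        CREATE TRIGGER person_mark_dirty
        AFTER UPDATE OF name ON person
        BEGIN
            UPDATE person SET syncstate = 1 WHERE id = NEW.id;
        END;
    """)
    return db

def make_target():
    """Target database: a plain copy of the table, without the flag column."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return db

def sync_once(source, target):
    """One polling pass: copy every flagged row, then clear its flag."""
    rows = source.execute(
        "SELECT id, name FROM person WHERE syncstate = 1"
    ).fetchall()
    for row_id, name in rows:
        target.execute(
            "INSERT OR REPLACE INTO person (id, name) VALUES (?, ?)",
            (row_id, name),
        )
        source.execute("UPDATE person SET syncstate = 0 WHERE id = ?", (row_id,))
    target.commit()
    source.commit()
    return len(rows)
```

In practice `sync_once` would run on a timer (every few minutes in our case). Note that this sketch has no way to propagate a deleted row, because the flag disappears together with the record.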

The disadvantage is that you need a lot of code to handle this correctly, especially because you cannot delete records directly: the sync tool first needs to learn about the deletion and then has to perform the actual delete itself. Besides that, it is hard to build a good queue this way, so if records are synced before their parents are, you'll get an error. And every table that must be synced needs an extra column.

So now we are about to implement a new solution, which uses a separate table for the queue. The queue contains pointers to records in other tables (the primary key value plus a reference to the table name/field name). This queue is now the only table the sync tool has to monitor for changes, so all a table needs to do is implement a single trigger that marks modified records as modified in the queue. Because it is a single queue in a separate table, it solves the problems I mentioned earlier:

  • Records can be deleted immediately. The sync tool finds an id in the queue, verifies that the record no longer exists, and deletes it from the other database too.
  • Child/parent dependencies are resolved automatically. A new parent will be in the queue before its children, and a deleted parent will be in the queue after its children. The only remaining problem is cross-linked records, although deferred commits might be a solution for that.
  • No need for an extra column in every table. You only need a single queue, some helper tables, and a single trigger containing a single function call on each table to be synced.
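The queue idea above can be sketched roughly as follows, again in Python with SQLite in-memory databases purely for illustration. The names `sync_queue`, `person`, and `drain_queue` are hypothetical. SQLite needs one trigger per event (insert, update, delete), whereas other engines can cover all three with a single trigger; the sketch also handles only one table, where a real tool would dispatch on `table_name`.

```python
import sqlite3

SOURCE_SCHEMA = """
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- The queue only stores pointers: which table and which primary key changed.
CREATE TABLE sync_queue (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,  -- preserves the order of changes
    table_name TEXT NOT NULL,
    pk INTEGER NOT NULL
);

CREATE TRIGGER person_q_ins AFTER INSERT ON person
BEGIN INSERT INTO sync_queue (table_name, pk) VALUES ('person', NEW.id); END;
CREATE TRIGGER person_q_upd AFTER UPDATE ON person
BEGIN INSERT INTO sync_queue (table_name, pk) VALUES ('person', NEW.id); END;
CREATE TRIGGER person_q_del AFTER DELETE ON person
BEGIN INSERT INTO sync_queue (table_name, pk) VALUES ('person', OLD.id); END;
"""

def make_source():
    db = sqlite3.connect(":memory:")
    db.executescript(SOURCE_SCHEMA)
    return db

def make_target():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return db

def drain_queue(source, target):
    """Process queue entries in order; a missing source row means 'delete'."""
    entries = source.execute(
        "SELECT seq, table_name, pk FROM sync_queue ORDER BY seq"
    ).fetchall()
    for seq, table_name, pk in entries:
        row = source.execute(
            "SELECT id, name FROM person WHERE id = ?", (pk,)
        ).fetchone()
        if row is None:
            # The record no longer exists in the source: remove it from the target.
            target.execute("DELETE FROM person WHERE id = ?", (pk,))
        else:
            target.execute(
                "INSERT OR REPLACE INTO person (id, name) VALUES (?, ?)", row
            )
        source.execute("DELETE FROM sync_queue WHERE seq = ?", (seq,))
    target.commit()
    source.commit()
    return len(entries)
```

Because entries are processed in `seq` order, a parent inserted before its children is also copied before them, which is the ordering property described in the second bullet.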

Unfortunately we haven't fully implemented this solution yet, so I can't tell you whether it will actually work better, though the tests definitely suggest so.

Mind that this system makes a one-to-one copy of records. I think that is the best approach too: copy the data first, and only afterwards process it on the target server. I don't think it is a good idea to process the data while copying it. If anything goes wrong, you'll have a hell of a job debugging and restoring or recalculating the data.
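To make the "copy first, process afterwards" split concrete, here is a small sketch under the same assumptions as before (Python and SQLite for illustration only). The tables `person_copy` and `person_processed` and the e-mail normalization step are invented for this example; the point is only that the copy step never transforms anything, so the processing step can be rerun from the untouched copy if it goes wrong.

```python
import sqlite3

TARGET_SCHEMA = """
CREATE TABLE person_copy (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE person_processed (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
"""

def copy_verbatim(source, target):
    """Step 1: copy rows exactly as they are, with no transformation."""
    rows = source.execute("SELECT id, email FROM person").fetchall()
    target.executemany(
        "INSERT OR REPLACE INTO person_copy (id, email) VALUES (?, ?)", rows
    )
    target.commit()
    return len(rows)

def process_on_target(target):
    """Step 2: derive processed data from the copy; safe to rerun at any time."""
    target.execute("DELETE FROM person_processed")
    target.execute(
        "INSERT INTO person_processed (id, email) "
        "SELECT id, lower(trim(email)) FROM person_copy"  # hypothetical processing
    )
    target.commit()
```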

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 