
How to achieve fast database sync with a read-only source?

I've got a source database (Sybase) that is read-only; the only way to write to it is with an import file. The other side is my own database (MSSQL), which has no limitations.

The main problem is that there are no timestamps in the source database, and I don't have any access to change it. So is there an engine or solution to get this sync done?

A diff algorithm might work, but it wouldn't be fast, in the sense that you would have to scan the whole source database for each synchronization.

Basically, you would do a full data extract in an agreed-upon, stable manner (i.e., two such extracts with no changes in between would produce identical output).
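As a minimal sketch of what "stable" could mean in practice, the following serializes rows with a fixed column order and a fixed row order, so that two extracts of unchanged data are byte-for-byte identical. The `stable_extract` function, the dict-per-row representation, and the CSV format are assumptions made for illustration; only the key name `pkvalue` comes from the query further down.

```python
import csv
import io


def stable_extract(rows, key="pkvalue"):
    """Serialize rows (a list of dicts) to CSV text deterministically.

    Hypothetical helper: columns are sorted by name and rows by primary
    key, so the order in which the database returns them does not matter.
    """
    if not rows:
        return ""
    columns = sorted(rows[0].keys())                 # fixed column order
    ordered = sorted(rows, key=lambda r: r[key])     # fixed row order
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(ordered)
    return buf.getvalue()


# The same data, returned in a different row and column order.
first = [{"pkvalue": 2, "name": "b"}, {"pkvalue": 1, "name": "a"}]
second = [{"name": "a", "pkvalue": 1}, {"name": "b", "pkvalue": 2}]

print(stable_extract(first) == stable_extract(second))  # True
```

Sorting on the primary key is the important part: without an explicit order, a database is free to return the same rows in a different sequence on each run.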

Then you compare that to the previous extract, and you can find all the changes. Something slightly more intelligent than a pure text diff would be needed, to determine that rows weren't just deleted and re-inserted, but in fact updated.
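One way such a key-aware comparison could look is sketched below. It indexes both extracts by primary key, so a row whose key survives but whose values differ is reported as an update rather than as a delete plus an insert. The `diff_extracts` function and the sample data are hypothetical, and the sketch assumes both extracts fit in memory.

```python
def diff_extracts(previous, current, key="pkvalue"):
    """Classify changes between two extracts (lists of dicts) by primary key."""
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}

    # Keys only in the current extract are inserts; keys only in the
    # previous extract are deletes.
    inserted = [curr_by_key[k] for k in sorted(curr_by_key.keys() - prev_by_key.keys())]
    deleted = [prev_by_key[k] for k in sorted(prev_by_key.keys() - curr_by_key.keys())]

    # Keys present on both sides with differing values are updates.
    updated = [
        curr_by_key[k]
        for k in sorted(prev_by_key.keys() & curr_by_key.keys())
        if prev_by_key[k] != curr_by_key[k]
    ]
    return {"inserted": inserted, "deleted": deleted, "updated": updated}


previous = [
    {"pkvalue": 1, "name": "a"},
    {"pkvalue": 2, "name": "b"},
    {"pkvalue": 3, "name": "c"},
]
current = [
    {"pkvalue": 1, "name": "a"},
    {"pkvalue": 2, "name": "B"},   # changed value, same key
    {"pkvalue": 4, "name": "d"},   # new key
]

changes = diff_extracts(previous, current)
print(changes["inserted"])  # [{'pkvalue': 4, 'name': 'd'}]
print(changes["deleted"])   # [{'pkvalue': 3, 'name': 'c'}]
print(changes["updated"])   # [{'pkvalue': 2, 'name': 'B'}]
```

This still reads every row of both extracts, so it does not remove the cost of the full scan; it only makes the resulting change set precise.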

Unfortunately, if there is no way to ask the source database what the latest changes are, whether because of the missing timestamps you've pointed out or the lack of a similar mechanism, then I don't see how you can do any better than a full extract each time.

Now, I don't know Sybase that well, but in MS SQL Server you could potentially create another database that mirrors the first, and in this second database you could make whatever changes you need.

If you can make such a database in Sybase as well, and use SQL to access both at the same time, you might be able to run queries that produce the differences.

For instance, something along the lines of:

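-- Rows that exist on only one side of the join
SELECT S.*, C.*
FROM sourcedb..sourcetable1 AS S
    FULL JOIN clonedb..sourcetable1 AS C
    ON S.pkvalue = C.pkvalue
WHERE S.pkvalue IS NULL   -- only in the clone: deleted from the source
   OR C.pkvalue IS NULL   -- only in the source: newly inserted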

This would produce rows that are inserted or deleted.

To find those that changed, you would need this WHERE-clause:

WHERE S.column1 <> C.column1
   OR S.column2 <> C.column2
   OR ....

Since the tables are joined on the primary key, the WHERE-clause keeps only the rows where the previous extract and the current state differ.
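One caveat with a plain `<>` comparison is that it evaluates to unknown when either side is NULL, so a change to or from NULL would not be reported. The sketch below demonstrates this using Python's built-in `sqlite3` module as a stand-in engine, not Sybase or SQL Server. The table names `source_t` and `clone_t` and the sample rows are made up for illustration. SQLite's `IS NOT` operator is a NULL-safe inequality; on other engines the same effect would typically need explicit `IS NULL` checks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_t (pkvalue INTEGER PRIMARY KEY, column1 TEXT);
    CREATE TABLE clone_t  (pkvalue INTEGER PRIMARY KEY, column1 TEXT);
    INSERT INTO source_t VALUES (1, 'a'), (2, NULL), (3, 'c');
    INSERT INTO clone_t  VALUES (1, 'a'), (2, 'b'),  (3, 'x');
""")

# Plain inequality: the row whose value became NULL is silently skipped.
plain = conn.execute("""
    SELECT S.pkvalue
    FROM source_t AS S
        JOIN clone_t AS C ON S.pkvalue = C.pkvalue
    WHERE S.column1 <> C.column1
    ORDER BY S.pkvalue
""").fetchall()

# NULL-safe inequality (SQLite-specific IS NOT): both changed rows appear.
null_safe = conn.execute("""
    SELECT S.pkvalue
    FROM source_t AS S
        JOIN clone_t AS C ON S.pkvalue = C.pkvalue
    WHERE S.column1 IS NOT C.column1
    ORDER BY S.pkvalue
""").fetchall()

print(plain)      # [(3,)]
print(null_safe)  # [(2,), (3,)]
```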

Now, this might not run fast either; you would have to test to make sure.
