
Testing SQL queries against multiple database systems

I'm involved in a migration project from Oracle to PostgreSQL, and I'm looking for a way to automate the testing of a large number of queries converted from Oracle syntax to PostgreSQL syntax. The assumption is that the data has been migrated successfully, so there is no need to check that. I can hack together a solution from scratch using Perl or Python, but there might be easier ways. I was looking at database testing frameworks like Test::DBUnit or pgTAP, but they assume that the user supplies the expected results to verify against, and in my case these are obtained from the database we are migrating from. So the question is: is there an existing database-specific tool or testing framework that can execute queries against the old (Oracle) and new (PostgreSQL) databases, fetch the results and compare them, highlighting the differences and any errors that occur in the process?

How about creating a JUnit project that runs each corresponding query against the two schemas (one Oracle, the other PostgreSQL)?

Alternatively, you could create two simple Maven projects (one per vendor). Each project would use an SQL plugin to run your queries (paste them in the same order into each pom.xml). You can later automate these tests with a continuous integration server that supports Maven (Hudson?) and set up a scheduled execution.

Good luck!

I ended up writing a custom tool in Python that runs the queries against both databases and collects the results, using psycopg2 and cx_Oracle. Comparing them is a matter of calculating a hash for each row and checking whether each Oracle row's hash exists among the hashes of the PostgreSQL rows. A few pitfalls:

  • Floating point numbers can lose precision when converted from Oracle/PostgreSQL to Python. Use the type-specific hooks in the drivers (see their documentation) to make sure you convert them to Decimal, not float.

  • It's tempting to just read one row at a time from both databases, compare the values and move on. However, that won't work unless the SQL result is explicitly ordered (with ORDER BY), because otherwise the two databases may return the same rows in different orders. Unfortunately, reading the results all at once means that you need a lot of memory for queries producing lots of rows.

  • One needs to distinguish between queries producing equal non-empty results and those producing 0 rows on both databases. The latter should be examined, and if the queries contain parameters, their values should be revised.
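
The comparison described above might be sketched roughly as follows. This is not the actual tool, only an illustration under some assumptions: rows arrive as plain tuples (as DB-API cursors such as those of psycopg2 and cx_Oracle return them), the function names `normalize`, `row_digest` and `compare_results` are made up for this example, and a `Counter` of hashes is used instead of a plain set so that duplicate rows are counted as well.

```python
import hashlib
from collections import Counter
from decimal import Decimal


def normalize(value):
    """Turn one column value into a canonical string (hypothetical helper)."""
    if value is None:
        return "\x00NULL"
    if isinstance(value, float):
        # Fallback only: ideally the driver hooks already deliver Decimal.
        value = Decimal(repr(value))
    if isinstance(value, Decimal):
        # Strip trailing zeros so that 1.50 and 1.5 produce the same text.
        return format(value.normalize(), "f")
    return str(value)


def row_digest(row):
    """Hash a whole row; a separator keeps ('ab', 'c') distinct from ('a', 'bc')."""
    joined = "\x1f".join(normalize(v) for v in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def compare_results(oracle_rows, postgres_rows):
    """Compare two result sets regardless of row order.

    Returns (status, rows_only_in_oracle, rows_only_in_postgres), where
    status is 'both_empty', 'equal' or 'different'.
    """
    oracle = Counter(row_digest(r) for r in oracle_rows)
    postgres = Counter(row_digest(r) for r in postgres_rows)
    only_oracle = sum((oracle - postgres).values())
    only_postgres = sum((postgres - oracle).values())
    if not oracle and not postgres:
        # Flag separately: an empty result on both sides proves very little.
        status = "both_empty"
    elif only_oracle == 0 and only_postgres == 0:
        status = "equal"
    else:
        status = "different"
    return status, only_oracle, only_postgres
```

In practice the two arguments would be the cursors (or `fetchall()` results) of the same query executed on each database. Only the digests are kept in memory rather than the full rows, which reduces the memory cost for wide rows but still grows with the number of rows.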

