
How to read Oracle data pump binary dump file directly?

For performance and other reasons, I am looking for a way to directly parse the binary file format of a data pump dump file.

The data pump utility "impdp" works only on the database server host, not on the DB client host. In order to run it, you have to send the whole dump file from the DB client to the DB server host and then run "impdp" over SSH.

Sometimes, for example when you only want a list of the schemas or tables included in the dump file, sending a huge file to a remote host makes no sense.

I am looking for a library (preferably in Java) or a format specification describing the dump file, so that I can write code to parse it locally, without the help of the official "impdp" utility.

Thanks.

UPDATE:

I use the following regular expression to filter the dump file to find table names:

^[\\x20-\\x7e\\s]{4,}.*</OWNER_NAME><NAME>([^<]*)</NAME>.*

The expression [\\x20-\\x7e\\s] means printable ASCII characters (0x20, the space character, through 0x7e) plus other whitespace. This filters out the binary lines.

The expression {4,} means at least 4 characters.

Since I am dealing with XML, I am extracting the "NAME" element that comes directly after the "OWNER_NAME" element. Maybe this way is not that elegant, but it seems to work.
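
The filtering described above could be sketched in Java roughly as follows. This is only an illustration under assumptions, not a parser for the real format: the class and method names (DumpTableNames, matchLine, extractTableNames) are made up for this example, the file is decoded as ISO-8859-1 so that arbitrary binary bytes never cause a decoding error, and Pattern.DOTALL is added so that stray control characters inside a line do not stop ".*" from matching. It relies on the metadata being stored as readable XML, so it would presumably find nothing in a dump whose metadata is compressed or encrypted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption made for this sketch.
class DumpTableNames {

    // The regular expression from the question, written as a Java string.
    // DOTALL is an addition so that "." also matches unusual line terminators.
    private static final Pattern TABLE_NAME = Pattern.compile(
            "^[\\x20-\\x7e\\s]{4,}.*</OWNER_NAME><NAME>([^<]*)</NAME>.*",
            Pattern.DOTALL);

    // Returns the NAME value if the whole line matches, otherwise null.
    static String matchLine(String line) {
        Matcher m = TABLE_NAME.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    // Scans the file line by line and collects distinct names in file order.
    static Set<String> extractTableNames(Path dumpFile) throws IOException {
        Set<String> names = new LinkedHashSet<>();
        // ISO-8859-1 maps every byte to one char, so binary data cannot fail to decode.
        try (BufferedReader reader =
                     Files.newBufferedReader(dumpFile, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String name = matchLine(line);
                if (name != null) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DumpTableNames <dump file>");
            return;
        }
        extractTableNames(Paths.get(args[0])).forEach(System.out::println);
    }
}
```

Because the pattern is applied with matches(), at most one name is captured per line, and a line that starts with fewer than four printable characters is skipped entirely, which mirrors the behaviour of the expression as described.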

Please comment if this approach helped you.

  • The impdp data format is proprietary.
  • You can also use the older imp/exp tools, which also work remotely. They are not as fast, though, because of network round trips.
  • You can also export the data with your own tool into a flat file, and then load it with sqlldr (using direct path load).
  • You can also put the dump file on an NFS share and then let Oracle access it via NFS.

Using Java/JDBC for manipulating huge amounts of data is not a good idea.
