ODI: Load Data from XML file & insert into Oracle Database

I'm a newbie at ODI (Oracle Data Integrator) 11g. I have an XML file and I need to load the data from it into an Oracle Database. I have created a project, imported the knowledge modules, and created the XML and Oracle models.

Note: My XML file consists of 40+ tables.

[Image: my interface]

[Image: the XML and Oracle models I created]

The target datastore holds only 1 table at the moment.

Here's my session log:

[Image: error message]

Edit

Based on the comment, this is a way to load several XML files that share the same XSD.

First, you must ensure that all the XML files have the same XSD structure, or else you might experience strange behaviour.

To process more than one XML file, this is basically what you have to do:

1) Pick a specific, fixed XML file name and create your topology based on it.
2) Create a "file processing control" to rename and move the files before reading.
3) Make sure you run the proper "synchronizing" commands.

Example

You have XML files:

/path/in/XML001.XML
/path/in/XML002.XML
/path/in/XML003.XML

Save one of the files as XML_DATA.xml, set up your topology with XML_DATA.xml in a work path such as /path/work/, and verify that everything works by testing it.

Then process a loop where:

1) Move /path/in/XML001.XML to /path/work/XML_DATA.xml (overwrite or delete the old XML_DATA.xml)
2) Execute "SYNCHRONIZE FROM FILE"
3) Process your Interfaces
4) Execute "SYNCHRONIZE FROM DATABASE"
5) Move the processed XML to /path/processed/
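
The file handling in the five steps above can be sketched in plain Python. This is only an illustration of the move, load, and archive order, not ODI code: the function name `process_files` and the `load_step` callback are hypothetical, and `load_step` stands in for everything ODI itself does in steps 2 to 4 (the two SYNCHRONIZE commands and the interfaces).

```python
import shutil
from pathlib import Path


def process_files(in_dir, work_file, processed_dir, load_step):
    """Feed each XML file in in_dir through one fixed work file.

    load_step is a hypothetical callback standing in for the ODI part
    (SYNCHRONIZE FROM FILE, the interfaces, SYNCHRONIZE FROM DATABASE).
    Returns the names of the files handled, in processing order.
    """
    in_dir = Path(in_dir)
    work_file = Path(work_file)
    processed_dir = Path(processed_dir)
    work_file.parent.mkdir(parents=True, exist_ok=True)
    processed_dir.mkdir(parents=True, exist_ok=True)

    # Match .xml and .XML alike; sort for a predictable order.
    sources = sorted(p for p in in_dir.iterdir()
                     if p.is_file() and p.suffix.lower() == ".xml")
    handled = []
    for source in sources:
        if work_file.exists():
            work_file.unlink()                    # step 1: drop the old work file
        shutil.move(str(source), str(work_file))  # step 1: move the file into place
        load_step(work_file)                      # steps 2-4: done by ODI
        # step 5: archive the file under its original name
        shutil.move(str(work_file), str(processed_dir / source.name))
        handled.append(source.name)
    return handled
```

Because every file passes through the same work file name, the topology only ever has to point at that one path. Inside ODI the same moves would be done with the package file tools rather than Python.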

Your Package would be something like:

[Loop] > [MoveFile] > [ProcedureSync] > [Interfaces] > [ProcedureSync] > [MoveFile] > [EndLoop]

There are a few ways to do the loop control; if you have doubts, I can send you tips.

Hope that helps!

Edit 2

Based on the new information, I will try to give a better explanation of the questions asked. This is not such a difficult task, but to someone who is still getting to know the ODI tool it may sound hard.

The main point is to understand that, for ODI, an XML file is a data source just like a database, and not a flat file like a .csv.

  • (Q) How do I create the File Processing Control?
  • (A) What I called "file processing control" is a simple mechanism where you move, copy, or delete your XML files across folders. You can accomplish that with the Package tools OdiFileCopy, OdiFileDelete, etc.
  • (Q) I don't understand the synchronizing command well enough. Can you give me more details?
  • (A) Synchronizing is necessary when using XML files in ODI. Basically, when ODI uses an XML file, it first loads it into memory. Then ODI creates a .lck file that locks the XML file. When your Package finishes, the XML file is still in memory, so you need to write it back to the file and release the lock, and ODI does not do that by itself. This is because you should be able to run as many packages as you see fit in your loading process with the XML file still available. So when you're finished, you must signal that you will not use the XML anymore by running the SYNCHRONIZE FROM DATABASE command.
  • (Q) Should I create an XML technology topology, a File technology topology, or both?
  • (A) You don't need to create a topology for File in order to run the package file tools. Just keep in mind that you must create your XML topology for a "generic" file name. Instead of setting your topology to OM135SVOD180624.xml, you should set it to OMDATASOURCE.xml.
  • (Q) I need more details on Loop Control.
  • (A) I will give a more detailed description. Unfortunately I don't have ODI installed here right now, or else I would also post an example. But I think it will be easy to understand.

1) Example of using Synchronizing when working with XML files.

Whenever you access an XML file for reading or writing, it's recommended to proceed as follows:

  • Run the "SYNCHRONIZE FROM FILE" command in the Logical Schema of the XML file. You can do this by creating an ODI Procedure, setting the technology to XML, pointing the Logical Schema to the one you created, and writing in the "command on target" window: SYNCHRONIZE FROM FILE.
  • Run your ODI packages or load plans that read or write the XML file.
  • Run the "SYNCHRONIZE FROM DATABASE" command in the Logical Schema of the XML file. You can do this by creating an ODI Procedure, setting the technology to XML, pointing the Logical Schema to the one you created, and writing in the "command on target" window: SYNCHRONIZE FROM DATABASE.
  • The full reference is here: https://docs.oracle.com/cd/E28280_01/integrate.1111/e12644/xml_file.htm#ODIKM534 in item 5.6.1.2.


2) File Processing Control

In the Package, inside the Toolbox bar, there is a "File" toolbar that gives you a lot of useful file tools, like copy, delete, move, zip, unzip, etc. This is useful when you need to control the files you are reading, rename them, and so on.

You can play a little with these tools; they are very easy to understand.

3) Loop Control

In ODI you can develop a loop using variables in a package. Sometimes you can process loops just by using an ODI Procedure; it depends on your needs. Based on the little information I have about your context, I would suggest you try using variables first.

So here is an example that processes a loop 5 times.

1) Create a Variable as a Number.
2) Drag it to the package and set the value to 0.
3) Drag it again, configure it as Evaluation, and set the condition to Equal to 5.
4) Drag any "test" interface that you have.
5) Link the KO link to the Interface. When evaluating variables, the KO link acts like the "false" condition. In our case, it will point to the interface while the counter is < 5.
6) Drag the variable again; this time, instead of setting a fixed value, set it to increment by 1. This will add 1 to the value.
7) Link this last variable back to the Evaluate variable.
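
The control flow that these seven package steps build can be written out as a short Python simulation. This is not ODI code; `run_loop` and `run_interface` are hypothetical names, and the callback stands in for the test interface.

```python
def run_loop(run_interface, limit=5):
    """Simulate the package loop: set, evaluate, run, increment, repeat."""
    counter = 0                  # step 2: set the variable to 0
    while True:
        if counter == limit:     # step 3: evaluate "Equal to 5"
            break                # condition true: the loop ends
        run_interface(counter)   # steps 4-5: KO (false) branch runs the interface
        counter += 1             # step 6: increment by 1
        # step 7: control returns to the evaluation at the top
    return counter
```

Calling `run_loop` with a callback that records its argument would call it with 0, 1, 2, 3, 4 and then stop, which is the "run 5 times" behaviour described above.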

So, this will run your test interface 5 times. I found an image on the internet that illustrates this:

[Image: loop example in ODI]

You can find useful information here: " https://dzone.com/articles/odi-11g-implementing-loops " and " https://blogs.oracle.com/dataintegration/using-variables-in-odi:-creating-a-loop-in-a-package ".

Final Package

At the end of the day, your "algorithm" would be something like this:

1) Process the Loop through the files in the folder (this is a little tricky, it may need Jython code; you can find a reference for that here: " https://blogs.perficient.com/2014/08/01/looping-through-files-in-a-folder-using-odi/ ")

2) Move the first file to /work/OMDATASOURCE.xml

3) SYNCHRONIZE FROM FILE command.

4) Process your Interfaces

5) SYNCHRONIZE FROM DATABASE command.

6) Move OMDATASOURCE.xml to '/processed' or any other control you create.

7) Process the next file. You may also want to track some things using tables, like files read, files processed, etc.

8) End Loop (using the loop control examples I sent).
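
Step 1 (finding the files) and the bookkeeping from step 7 can be sketched as follows. This is plain Python rather than the Jython from the linked article, the name `pending_xml_files` is hypothetical, and the `processed_names` collection is an assumption standing in for a control table of files already processed.

```python
import os


def pending_xml_files(folder, processed_names):
    """Return the XML file names in folder that are not in processed_names."""
    done = set(name.lower() for name in processed_names)
    names = []
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)
        if (os.path.isfile(full_path)
                and name.lower().endswith(".xml")
                and name.lower() not in done):
            names.append(name)
    # Sort so that files are handled in a predictable order.
    return sorted(names)
```

Each name returned here would be one iteration of the loop; once a file has gone through steps 2 to 6, its name would be added to the control table so that a rerun skips it.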

ODI is very flexible and extensible; you can do almost anything, in a lot of ways.

Considerations

  • There are other ways to do this. You can use a variable in the name of the XML file. This would remove the need to "move and rename", but you will still need to loop over the files in the path to get the names of the files you need, or at least have them in a table so you can process the loop and change the variable's value.

Hope this will help, []'s

Cheers
