
Best ways to import data (initial load) from Oracle to Elasticsearch

I am working on a project with two big tables (parent and child) in Oracle. One has 65 million records and the other 80 million. In total, 10 columns from these tables are required, to be saved as one document in Elasticsearch. The two tables can also be loaded separately. What are two comparable options for moving the data (a one-time load) from these tables into Elasticsearch, and which of the two would you recommend? The requirement is that the load be fast and simple, so that it can be used not only for the initial load but also to rebuild the Elasticsearch index from scratch after a failure.
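Whichever tool does the ingestion, the core of the load is the same: join the parent and child tables in SQL, turn each joined row into one Elasticsearch document, and index in bulk. A minimal Python sketch of the row-to-document step is below; the table names, column layout, index name `parent_child`, and the deterministic `_id` scheme are all assumptions for illustration, not from the question.

```python
def rows_to_actions(rows, index="parent_child"):
    """Turn joined (parent_id, child_id, *columns) rows into bulk actions.

    A deterministic _id (parent_id-child_id) means a re-run after a
    failure simply overwrites documents instead of duplicating them.
    """
    for row in rows:
        parent_id, child_id = row[0], row[1]
        yield {
            "_index": index,
            "_id": f"{parent_id}-{child_id}",
            "_source": {
                "parent_id": parent_id,
                "child_id": child_id,
                # remaining selected columns, kept as a list here for brevity
                "payload": list(row[2:]),
            },
        }

# With the oracledb and elasticsearch packages installed, the load loop
# would look roughly like this (connection details are placeholders):
#
#   import oracledb
#   from elasticsearch import Elasticsearch, helpers
#
#   conn = oracledb.connect(user="...", password="...", dsn="dbhost/ORCLPDB1")
#   cur = conn.cursor()
#   cur.execute("SELECT p.id, c.id, /* 10 columns */ ... "
#               "FROM parent p JOIN child c ON c.parent_id = p.id")
#   helpers.bulk(Elasticsearch("http://localhost:9200"),
#                rows_to_actions(cur), chunk_size=5000)
```

Streaming the cursor through a generator keeps memory flat even at 80 million rows, and bulk indexing in chunks is what makes the one-time load fast.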

As already suggested, one option is Logstash. Its advantage is simplicity, but it can be complicated to monitor, and it can be difficult to configure if you have to transform fields during ingestion.
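For reference, a Logstash pipeline for this kind of load uses the JDBC input plugin and the Elasticsearch output plugin. The sketch below is a minimal example; the connection string, credentials, driver path, index name, and the join statement are all placeholders to adapt, not values from the question.

```conf
input {
  jdbc {
    jdbc_driver_library => "/path/to/ojdbc8.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1"
    jdbc_user => "loader"
    jdbc_password => "secret"
    jdbc_fetch_size => 10000
    statement => "SELECT p.id AS parent_id, c.id AS child_id /*, the other columns */ FROM parent p JOIN child c ON c.parent_id = p.id"
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "parent_child"
    document_id => "%{parent_id}-%{child_id}"
  }
}
```

Setting `document_id` from the row keys makes the load idempotent, so the same pipeline can be re-run to rebuild the index from scratch after a failure.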

An alternative is NiFi: it offers JDBC and Elasticsearch processors, and you can monitor, start, and stop the ingestion directly from the web interface. With NiFi it is possible to build a more complex and robust pipeline: handling exceptions, translating data types, and performing data enrichment.

