Apache Nifi ExecuteSQL Processor

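I am trying to fetch data from an Oracle database using the ExecuteSQL processor. Say my Oracle database has 15 records. When I run the ExecuteSQL processor, it runs continuously as a stream, stores all the records as a single file in HDFS, and then repeats the same operation. As a result, the HDFS location ends up with many files containing the same data, because records that were already fetched from the Oracle DB are fetched again. How can I make this processor fetch all the data from the Oracle DB once and store it as a single file, and then ingest only new records into the HDFS location when they are inserted into the DB?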

Take a look at the QueryDatabaseTable processor:

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.QueryDatabaseTable/index.html

You will need to tell this processor which column or columns to use to track new records; this is the Maximum Value Columns property. If your table has an incrementing id column, you can use that: every time the processor runs, it records the last id it saw and starts from there on the next execution.
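
To make the tracking idea concrete, here is a minimal sketch in Python of the same pattern: remember the largest value seen in a chosen column and only select rows above it on the next run. This illustrates the concept and is not NiFi's actual implementation. The `orders` table, the `fetch_new_rows` helper, and the `state` dict are all hypothetical, and an in-memory SQLite database stands in for Oracle.

```python
import sqlite3


def fetch_new_rows(conn, table, max_value_column, state):
    """Return rows whose max_value_column exceeds the last value seen.

    `state` is a plain dict standing in for whatever the processor
    persists between runs (an assumption made for this sketch).
    Table and column names are trusted here; do not pass user input.
    """
    last_seen = state.get(max_value_column)
    if last_seen is None:
        # First run: nothing tracked yet, so fetch everything.
        cursor = conn.execute(
            f"SELECT * FROM {table} ORDER BY {max_value_column}"
        )
    else:
        # Later runs: only rows newer than the stored maximum.
        cursor = conn.execute(
            f"SELECT * FROM {table} WHERE {max_value_column} > ? "
            f"ORDER BY {max_value_column}",
            (last_seen,),
        )
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchall()
    if rows:
        idx = columns.index(max_value_column)
        state[max_value_column] = max(row[idx] for row in rows)
    return rows


# Hypothetical table with 15 records, as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO orders (item) VALUES (?)",
    [(f"item-{i}",) for i in range(15)],
)

state = {}
first_run = fetch_new_rows(conn, "orders", "id", state)   # all existing rows
second_run = fetch_new_rows(conn, "orders", "id", state)  # nothing new yet
conn.execute("INSERT INTO orders (item) VALUES ('item-new')")
third_run = fetch_new_rows(conn, "orders", "id", state)   # only the new row
```

The first call returns every existing row, the second returns nothing because no id exceeds the stored maximum, and the third returns only the newly inserted record. That is the behavior the question asks for: one full load, then new records only.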
