Schedule batch SQL in NiFi

I'm using NiFi to connect two systems:

  • A source system that generates events in a Kafka topic
  • A destination system, of which I will only consider the Oracle database here.

I need to reduce the JSON messages coming in on the Kafka topic and push them into the appropriate tables. There are no major issues in doing this, but the source system generates too many events, and the destination database triggers a process for every modification. It is not sized to handle that many processes.

So I'm doing bulk updates in my DB, using the PutSQL Processor behind a Text Processor + Update Attribute Processor + ReplaceText Processor (as shown here, for example: https://community.hortonworks.com/articles/91849/design-nifi-flow-for-using-putsql-processor-to-per.html ).

But this workflow only lets me update my DB based on the number of elements to put in it (my batch size).

I would like to bulk update on a regular, time-based schedule. The reason is that source events do not arrive at a steady rate, and the destination database cannot be more than 5 minutes "behind" the source. So I need to schedule my bulk update to run at least every 5 minutes.

I can't see right now how to do this. Could you please tell me which processors/solution you would use?

PS: Of course, plenty of better solutions exist, like not triggering heavy processes on each commit in my destination database, but changing this "good old system" is not affordable right now.

Cheers, Olivier

I'd suggest using the Wait and Notify processors in tandem to set up a "gate" which holds flowfiles in a queue until the Notify processor (with a run schedule of ~5 minutes) sends the "trigger" flowfile. Koji Kawamura has written an extensive article documenting this pattern.
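
To make the "gate" idea concrete, here is a minimal Python sketch of the concept only. It is not NiFi code and does not use any NiFi API: the class name TimedGate, its methods, and the injectable clock are all hypothetical, chosen for illustration. Items are held in a queue and released as one batch once the interval has elapsed, which is roughly what the Wait/Notify pair achieves with flowfiles.

```python
import time
from collections import deque
from typing import Callable, Deque, List


class TimedGate:
    """Hypothetical illustration: hold items until an interval elapses,
    then release everything queued so far as a single batch."""

    def __init__(self, interval_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_s = interval_s
        self.clock = clock              # injectable so the gate can be tested
        self.queue: Deque[str] = deque()
        self.last_release = clock()

    def offer(self, item: str) -> None:
        # Incoming items always wait in the queue (the gate is closed).
        self.queue.append(item)

    def poll(self) -> List[str]:
        # Release the whole queue only if the interval has passed.
        now = self.clock()
        if now - self.last_release < self.interval_s:
            return []
        self.last_release = now
        batch = list(self.queue)
        self.queue.clear()
        return batch
```

With interval_s=300, whatever has accumulated is released at most once every 5 minutes, regardless of how many items arrived in between.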

Well... The answer is pretty simple indeed. You just need to go to the "Schedule" tab of the processor. I'm now running 1.6.0-SNAPSHOT (by the way, it looks like this option has been there for a long time... I just did not notice it), and it provides scheduling with the ability to set up a Cron scheduler, which perfectly answers the need...
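
For reference, a sketch of what such a setting could look like for a 5-minute schedule. This is an assumption on my part rather than the exact configuration used above: NiFi's CRON-driven strategy takes Quartz-style expressions (seconds, minutes, hours, day of month, month, day of week), and which processor to apply it to (presumably PutSQL) depends on the flow.

```
Scheduling Strategy : CRON driven
Run Schedule        : 0 0/5 * * * ?
```

The expression above fires at second 0 of every fifth minute of every hour.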
