
TailFile Processor - Apache NiFi

I'm using the TailFile processor to fetch logs from a 3-node cluster, scheduled to run every minute. The log file name changes every hour, and I'm confused about which tailing mode to use. In Single File mode, it does not fetch the new file generated after 1 hour. In Multiple Files mode, it fetches the new file only about 3 minutes after the file name changes, which increases the size of the file. What should the rolling filename pattern be for my file, and which mode should I use? Could you please let me know? Thank you.

My file names: `retrieve-11.log` (generated at 11:00; this file is removed, but Single File mode still checks for it after 1 hour), then `retrieve-12.log` (generated at 12:00).

My processor configuration:

Tailing mode: Multiple Files

File(s) to Tail: retrieve-${now():format("HH")}.log

Rolling Filename Pattern: ${filename}.*.log

Base Directory: /ext/logs

Initial Start Position: Beginning of File

State Location: Local

Recursive lookup: false

Lookup Frequency: 10 minutes

Maximum age: 24 hours

Sounds like you aren't really doing normal log file rolling. That would be, for example, where you write to `logfile.log`, then after 1 day you move `logfile.log` to `logfile.log.1` and write new logs to a new, empty `logfile.log`.
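For contrast, here is a minimal sketch of that conventional rolling scheme, using a temporary directory. The `roll` helper is hypothetical and only illustrates the rename-then-recreate step; it is not how any particular logging framework implements it.

```python
import tempfile
from pathlib import Path


def roll(log_path: Path) -> Path:
    """Move the active log to '<name>.1' and recreate an empty active log."""
    rolled = log_path.with_name(log_path.name + ".1")
    log_path.replace(rolled)  # logfile.log -> logfile.log.1
    log_path.touch()          # new, empty logfile.log
    return rolled


with tempfile.TemporaryDirectory() as tmp:
    active = Path(tmp) / "logfile.log"
    active.write_text("day 1 entry\n")

    rolled = roll(active)
    active.write_text("day 2 entry\n")

    rolled_text = rolled.read_text()
    active_text = active.read_text()
    names = sorted(p.name for p in Path(tmp).iterdir())

print(names)
```

The active file keeps the same name throughout, which is the case a rolling filename pattern is meant to handle.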

Instead, it sounds like you are just writing logs to a different file based on the hour. I assume this means you overwrite each file every 24h?

So something like this might work?

[Image: processor configuration]

EDIT:

So given that you are doing the following:

At 10:00, `retrieve-10.log` is created. Logs are written here.
At 11:00, `retrieve-11.log` is created. Logs are now written here.
At 11:10, `retrieve-10.log` is moved.

TailFile is only run every 10 minutes.

Then targeting a file based on the hour won't work. At 10:00, your TailFile only reads `retrieve-10.log`. At 11:00, your TailFile only reads `retrieve-11.log`. So in the worst case, you miss 10 minutes of logs between 10:50 and 11:00.

Given that another process is cleaning up the old files, there isn't going to be a backlog of old files to worry about. So it sounds like there's no need to set the hour specifically.
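A rough sketch of that gap, assuming a run every 10 minutes as above. The helper names and the date are made up for illustration; `hourly_target` only mimics what an hour-based name such as `retrieve-${now():format("HH")}.log` would resolve to at each run.

```python
from datetime import datetime, timedelta


def hourly_target(run_time: datetime) -> str:
    # File name an hour-based expression resolves to at this run.
    return f"retrieve-{run_time:%H}.log"


def unread_tail(last_run: datetime) -> timedelta:
    # Time between a run and the top of the next hour; anything written
    # to the old file in this window is never picked up by a later run.
    next_hour = last_run.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return next_hour - last_run


first_run = datetime(2024, 1, 1, 10, 40)  # arbitrary date, assumed schedule
runs = [first_run + timedelta(minutes=10 * i) for i in range(4)]
targets = [(t.strftime("%H:%M"), hourly_target(t)) for t in runs]
for label, name in targets:
    print(label, "->", name)

gap = unread_tail(datetime(2024, 1, 1, 10, 50))
print("written to retrieve-10.log after the 10:50 run:", gap)
```

The run at 11:00 already targets `retrieve-11.log`, so whatever reached `retrieve-10.log` after the 10:50 run is skipped.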

tailing mode: multiple files
base directory: /ext/logs
files to tail: retrieve-.*\.log

With this, at 10:00, TailFile tails `retrieve-9.log` and `retrieve-10.log`. At 10:10, `retrieve-9.log` is removed and it tails `retrieve-10.log`. At 11:00, it tails `retrieve-10.log` and `retrieve-11.log`. At 11:10, `retrieve-10.log` is removed and it tails `retrieve-11.log`. Etc.
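The timeline above can be sketched as follows. This assumes that in Multiple Files mode the File(s) to Tail value is treated as a regular expression that must match the whole file name under the base directory, so the wildcard is written here as `retrieve-.*\.log`. Python's `re.fullmatch` stands in for NiFi's matching, and the directory snapshots are invented for illustration.

```python
import re

# Assumed regex form of the wildcard; must match the full file name.
FILES_TO_TAIL = re.compile(r"retrieve-.*\.log")


def tailed(listing):
    """Return the files in a directory listing that the pattern selects."""
    return [name for name in listing if FILES_TO_TAIL.fullmatch(name)]


# Hypothetical directory contents at each point in the timeline.
snapshots = {
    "10:00": ["retrieve-9.log", "retrieve-10.log", "other.log"],
    "10:10": ["retrieve-10.log", "other.log"],
    "11:00": ["retrieve-10.log", "retrieve-11.log", "other.log"],
    "11:10": ["retrieve-11.log", "other.log"],
}
result = {time: tailed(files) for time, files in snapshots.items()}
for time, files in result.items():
    print(time, files)

# A shell-style glob read as a regex does not match the same names.
glob_as_regex = re.fullmatch(r"retrieve-*.log", "retrieve-10.log")
print(glob_as_regex)
```

Because the old file is still selected for the first minutes of each hour, the last lines written before the switch are still read.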
