
Why is NiFi losing FlowFile content?

I am facing a very weird issue.

This error occurs randomly in the "EvaluateJsonPath 1.6.0" processor. I have 3 instances of this processor in my workflow. The error does not always occur in the same instance; its location is random.

Sometimes the flow runs fine (very rarely). The error is frequent, but where it occurs is random.

The flow looks like this: fire HTTP URL -> evaluate result JSON -> get more URLs -> call those HTTP URLs -> evaluate -> wait -> merge all the results -> write to FS -> end

The wait step waits for approximately 30 minutes.

Each relationship has enough buffer (5 GB, 100000 FlowFiles), and I can see no back pressure.

The system has enough free memory, and the JVM is running with a 28 GB heap.

I am on version 1.6.0.

What could be the reason? Is some background process cleaning up the file before the processor releases it?

Is it possible that I have configured some optimization that is compelling NiFi to clean the content folder? The content folder is not empty, though; it still has old files in it, so that can't be it.

I am really confused.

I can see the following stack trace:

```
2018-11-14 12:04:04,120 ERROR [Timer-Driven Process Thread-2] o.a.n.p.standard.EvaluateJsonPath EvaluateJsonPath[id=d9d338ca-5396-3f8c-e134-753aacda1ca6] EvaluateJsonPath[id=d9d338ca-5396-3f8c-e134-753aacda1ca6] failed to process session due to org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile; Processor Administratively Yielded for 1 sec: org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
org.apache.nifi.processor.exception.MissingFlowFileException: Unable to find content for FlowFile
        at org.apache.nifi.controller.repository.StandardProcessSession.handleContentNotFound(StandardProcessSession.java:3104)
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2228)
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
        at org.apache.nifi.processors.standard.AbstractJsonPathProcessor.validateAndEstablishJsonContext(AbstractJsonPathProcessor.java:77)
        at org.apache.nifi.processors.standard.EvaluateJsonPath.onTrigger(EvaluateJsonPath.java:271)
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1147)
        at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.nifi.controller.repository.ContentNotFoundException: Could not find content for StandardContentClaim [resourceClaim=StandardResourceClaim[id=1542197040430-4449, container=default, section=353], offset=844526, length=142607]
        at org.apache.nifi.controller.repository.StandardProcessSession.getInputStream(StandardProcessSession.java:2167)
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2192)
        ... 14 common frames omitted
Caused by: java.io.EOFException: null
        at org.apache.nifi.stream.io.StreamUtils.skip(StreamUtils.java:242)
        at org.apache.nifi.controller.repository.FileSystemRepository.read(FileSystemRepository.java:859)
        at org.apache.nifi.controller.repository.StandardProcessSession.getInputStream(StandardProcessSession.java:2135)
        ... 15 common frames omitted
```
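Since the error location is random, it may help to collect every content claim that the log reports as missing and see whether they share a resource claim or section. The following is only a minimal sketch: the regular expression is derived from the single `ContentNotFoundException` line in the trace above, and the function name `parse_missing_claims` is made up for illustration.

```python
import re

# Pattern for the claim description printed in the ContentNotFoundException
# message of the trace above. It is based on that one log line only; other
# NiFi versions may format the claim differently.
CLAIM_RE = re.compile(
    r"StandardResourceClaim\[id=(?P<id>[^,\]]+), "
    r"container=(?P<container>[^,\]]+), "
    r"section=(?P<section>[^,\]]+)\], "
    r"offset=(?P<offset>\d+), length=(?P<length>\d+)"
)


def parse_missing_claims(log_text):
    """Return one dict per content claim mentioned in log_text."""
    claims = []
    for match in CLAIM_RE.finditer(log_text):
        claim = match.groupdict()
        claim["offset"] = int(claim["offset"])
        claim["length"] = int(claim["length"])
        claims.append(claim)
    return claims


# The "Caused by" line copied from the stack trace in the question.
sample = (
    "Caused by: org.apache.nifi.controller.repository.ContentNotFoundException: "
    "Could not find content for StandardContentClaim "
    "[resourceClaim=StandardResourceClaim[id=1542197040430-4449, "
    "container=default, section=353], offset=844526, length=142607]"
)

if __name__ == "__main__":
    for claim in parse_missing_claims(sample):
        print(claim)
```

Running this over the whole `nifi-app.log` instead of `sample` would show whether the failures cluster on a few claim files or are spread across the repository.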

After searching a bit, it seems there may be two kinds of problems that could cause errors like this.

  1. Problems regarding the content of the message

This kind of problem should be reproducible, so if a message fails once in your pipeline, it should fail again when you re-run it. These problems should be easy to solve.

Things to look out for are incorrect message content, or perhaps even empty messages.

  2. Problems with the content repository

When NiFi ingests a message, its content is placed in the content repository until it needs to be touched again. The second possibility is that there is a problem with this repository.

Likely causes would be external interference with the repository (something else writing, deleting, or locking the content), or perhaps problems at the OS level. I found this angle thanks to this HCC post: https://community.hortonworks.com/questions/231364/nifi-processing-files-as-zero-bytes.html
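One way to test the repository theory is to look at the claim file named in the exception directly on disk. The sketch below assumes the file-based content repository stores each resource claim at `<repository dir>/<section>/<id>` and moves archived claims to `<section>/archive/<id>`; treat that layout as an assumption to verify against your own installation. The function name `check_claim` is hypothetical. The `EOFException` raised from `StreamUtils.skip` in the trace would be consistent with a file that exists but is shorter than the claim's offset, which is the "truncated" case here.

```python
import os


def check_claim(repo_dir, section, claim_id, offset, length):
    """Describe the on-disk state of one content claim.

    Assumed layout (verify locally): <repo_dir>/<section>/<claim_id> for
    live claims, <repo_dir>/<section>/archive/<claim_id> for archived ones.
    """
    candidates = (
        ("live", os.path.join(repo_dir, section, claim_id)),
        ("archived", os.path.join(repo_dir, section, "archive", claim_id)),
    )
    needed = offset + length  # bytes the file must hold to serve this claim
    for label, path in candidates:
        if os.path.isfile(path):
            size = os.path.getsize(path)
            if size < needed:
                return f"{label} but truncated: {size} bytes, need {needed}"
            return f"{label} and complete"
    return "missing"


if __name__ == "__main__":
    # Values taken from the exception in the question; the repository
    # directory is a placeholder for your configured content repository.
    print(check_claim("./content_repository", "353",
                      "1542197040430-4449", 844526, 142607))
```

If the file turns out to be truncated or missing while the FlowFile is still queued, that points toward something outside the processor (another process, the filesystem, or the OS) rather than the message content.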

I am not able to help dig into this, but if you have arranged support from Cloudera (the driving force behind NiFi), the support team should be able to investigate further if needed.

Full disclosure: I am an employee of Cloudera

We encountered a similar problem in our cluster. Have you found a solution? I opened a similar thread: "Missing flowfile exception on Nifi processing cause loss of information".

